file fields
The file endpoint uses a file attributes first schema to enable faster search, but contains all the same data as the subject endpoint.
Column names that have a .
between words denote that the term after the .
is a nested field. Nesting structure can be more easily browsed in the file JSON schema
column_name | description | data_ty[e |
---|---|---|
ResearchSubject | A research subject is the entity of interest in a specific research study or project, typically a human being or an animal, but can also be a device, group of humans or animals, or a tissue sample. Human research subjects are usually not traceable to a particular person to protect the subject’s privacy. This entity plays the role of the case_id in existing data. | RECORD |
ResearchSubject.Diagnosis | A collection of characteristics that describe an abnormal condition of the body as assessed at a point in time. May be used to capture information about neoplastic and non-neoplastic conditions. | RECORD |
ResearchSubject.Diagnosis.Treatment | Represent medication administration or other treatment types. | RECORD |
ResearchSubject.Diagnosis.Treatment.days_to_treatment_end | The timepoint at which the treatment ended. | INTEGER |
ResearchSubject.Diagnosis.Treatment.days_to_treatment_start | The timepoint at which the treatment started. | INTEGER |
ResearchSubject.Diagnosis.Treatment.id | The 'logical' identifier of the entity in the repository, e.g. a UUID. This 'id' is unique within a given system. The identified entity may have a different 'id' in a different system. | STRING |
ResearchSubject.Diagnosis.Treatment.identifier | A 'business' identifier or accession number for the entity, typically as provided by an external system or authority, that persists across implementing systems (i.e. a 'logical' identifier). | RECORD |
ResearchSubject.Diagnosis.Treatment.identifier.system | The system or namespace that defines the identifier. | STRING |
ResearchSubject.Diagnosis.Treatment.identifier.value | The value of the identifier, as defined by the system. | STRING |
ResearchSubject.Diagnosis.Treatment.number_of_cycles | The number of treatment cycles the subject received. | INTEGER |
ResearchSubject.Diagnosis.Treatment.therapeutic_agent | One or more therapeutic agents as part of this treatment. | STRING |
ResearchSubject.Diagnosis.Treatment.treatment_anatomic_site | The anatomical site that the treatment targets. | STRING |
ResearchSubject.Diagnosis.Treatment.treatment_effect | The effect of a treatment on the diagnosis or tumor. | STRING |
ResearchSubject.Diagnosis.Treatment.treatment_end_reason | The reason the treatment ended. | STRING |
ResearchSubject.Diagnosis.Treatment.treatment_outcome | The final outcome of the treatment. | STRING |
ResearchSubject.Diagnosis.Treatment.treatment_type | The treatment type including medication/therapeutics or other procedures. | STRING |
ResearchSubject.Diagnosis.age_at_diagnosis | The age in days of the individual at the time of diagnosis. | INTEGER |
ResearchSubject.Diagnosis.grade | The degree of abnormality of cancer cells, a measure of differentiation, the extent to which cancer cells are similar in appearance and function to healthy cells of the same tissue type. The degree of differentiation often relates to the clinical behavior of the particular tumor. Based on the microscopic findings, tumor grade is commonly described by one of four degrees of severity. Histopathologic grade of a tumor may be used to plan treatment and estimate the future course, outcome, and overall prognosis of disease. Certain types of cancers, such as soft tissue sarcoma, primary brain tumors, lymphomas, and breast have special grading systems. | STRING |
ResearchSubject.Diagnosis.id | The 'logical' identifier of the entity in the repository, e.g. a UUID. This 'id' is unique within a given system. The identified entity may have a different 'id' in a different system. | STRING |
ResearchSubject.Diagnosis.identifier | A 'business' identifier or accession number for the entity, typically as provided by an external system or authority, that persists across implementing systems (i.e. a 'logical' identifier). | RECORD |
ResearchSubject.Diagnosis.identifier.system | The system or namespace that defines the identifier. | STRING |
ResearchSubject.Diagnosis.identifier.value | The value of the identifier, as defined by the system. | STRING |
ResearchSubject.Diagnosis.method_of_diagnosis | The method used to confirm the subjects malignant diagnosis. | STRING |
ResearchSubject.Diagnosis.morphology | Code that represents the histology of the disease using the third edition of the International Classification of Diseases for Oncology, published in 2000, used principally in tumor and cancer registries for coding the site (topography) and the histology (morphology) of neoplasms. | STRING |
ResearchSubject.Diagnosis.primary_diagnosis | The diagnosis instance that qualified a subject for inclusion on a ResearchProject. | STRING |
ResearchSubject.Diagnosis.stage | The extent of a cancer in the body. Staging is usually based on the size of the tumor, whether lymph nodes contain cancer, and whether the cancer has spread from the original site to other parts of the body. | STRING |
ResearchSubject.id | The 'logical' identifier of the entity in the system of record, e.g. a UUID. This 'id' is unique within a given system. The identified entity may have a different 'id' in a different system. For CDA, this is case_id. | STRING |
ResearchSubject.identifier | A 'business' identifier for the entity, typically as provided by an external system or authority, that persists across implementing systems (i.e. a 'logical' identifier). Uses a specialized, complex 'Identifier' data type to capture information about the source of the business identifier - or a URI expressed as a string to an existing entity. | RECORD |
ResearchSubject.identifier.system | The system or namespace that defines the identifier. | STRING |
ResearchSubject.identifier.value | The value of the identifier, as defined by the system. | STRING |
ResearchSubject.member_of_research_project | A reference to the Study(s) of which this ResearchSubject is a member. | STRING |
ResearchSubject.primary_diagnosis_condition | The text term used to describe the type of malignant disease, as categorized by the World Health Organization's (WHO) International Classification of Diseases for Oncology (ICD-O). This attribute represents the disease that qualified the subject for inclusion on the ResearchProject. | STRING |
ResearchSubject.primary_diagnosis_site | The text term used to describe the primary site of disease, as categorized by the World Health Organization's (WHO) International Classification of Diseases for Oncology (ICD-O). This categorization groups cases into general categories. This attribute represents the primary site of disease that qualified the subject for inclusion on the ResearchProject. | STRING |
Specimen | Any material taken as a sample from a biological entity (living or dead), or from a physical object or the environment. Specimens are usually collected as an example of their kind, often for use in some investigation. | RECORD |
Specimen.anatomical_site | Per GDC Dictionary, the text term that represents the name of the primary disease site of the submitted tumor sample; recommend dropping tumor; biospecimen_anatomic_site. | STRING |
Specimen.associated_project | The Project associated with the specimen. | STRING |
Specimen.days_to_collection | The number of days from the index date to either the date a sample was collected for a specific study or project, or the date a subject underwent a procedure (e.g. surgical resection) yielding a sample that was eventually used for research. | INTEGER |
Specimen.derived_from_specimen | A source/parent specimen from which this one was directly derived. | STRING |
Specimen.derived_from_subject | The Patient/ResearchSubject, or Biologically Derived Materal (e.g. a cell line, tissue culture, organoid) from which the specimen was directly or indirectly derived. | STRING |
Specimen.id | The 'logical' identifier of the entity in the system of record, e.g. a UUID. This 'id' is unique within a given system. The identified entity may have a different 'id' in a different system. | STRING |
Specimen.identifier | A 'business' identifier or accession number for the entity, typically as provided by an external system or authority, that persists across implementing systems (i.e. a 'logical' identifier). | RECORD |
Specimen.identifier.system | The system or namespace that defines the identifier. | STRING |
Specimen.identifier.value | The value of the identifier, as defined by the system. | STRING |
Specimen.primary_disease_type | The text term used to describe the type of malignant disease, as categorized by the World Health Organization's (WHO) International Classification of Diseases for Oncology (ICD-O). This attribute represents the disease that qualified the subject for inclusion on the ResearchProject. | STRING |
Specimen.source_material_type | The general kind of material from which the specimen was derived, indicating the physical nature of the source material. | STRING |
Specimen.specimen_type | The high-level type of the specimen, based on its how it has been derived from the original extracted sample. | STRING |
Subject | A patient entity captures the study-independent metadata for research subjects. Human research subjects are usually not traceable to a particular person to protect the subject’s privacy. | RECORD |
Subject.cause_of_death | Coded value indicating the circumstance or condition that results in the death of the subject. | STRING |
Subject.days_to_birth | Number of days between the date used for index and the date from a person's date of birth represented as a calculated negative number of days. | INTEGER |
Subject.days_to_death | Number of days between the date used for index and the date from a person's date of death represented as a calculated number of days. | INTEGER |
Subject.ethnicity | An individual's self-described social and cultural grouping, specifically whether an individual describes themselves as Hispanic or Latino. The provided values are based on the categories defined by the U.S. Office of Management and Business and used by the U.S. Census Bureau. | STRING |
Subject.id | The 'logical' identifier of the entity in the system of record, e.g. a UUID. This 'id' is unique within a given system. The identified entity may have a different 'id' in a different system. | STRING |
Subject.identifier | A 'business' identifier for the entity, typically as provided by an external system or authority, that persists across implementing systems (i.e. a 'logical' identifier). Uses a specialized, complex 'Identifier' data type to capture information about the source of the business identifier - or a URI expressed as a string to an existing entity. | RECORD |
Subject.identifier.system | The system or namespace that defines the identifier. | STRING |
Subject.identifier.value | The value of the identifier, as defined by the system. | STRING |
Subject.race | An arbitrary classification of a taxonomic group that is a division of a species. It usually arises as a consequence of geographical isolation within a species and is characterized by shared heredity, physical attributes and behavior, and in the case of humans, by common history, nationality, or geographic distribution. The provided values are based on the categories defined by the U.S. Office of Management and Business and used by the U.S. Census Bureau. | STRING |
Subject.sex | The biologic character or quality that distinguishes male and female from one another as expressed by analysis of the person's gonadal, morphologic (internal and external), chromosomal, and hormonal characteristics. | STRING |
Subject.species | The taxonomic group (e.g. species) of the patient. For MVP, since taxonomy vocabulary is consistent between GDC and PDC, using text. Ultimately, this will be a term returned by the vocabulary service. | STRING |
Subject.subject_associated_project | The list of Projects associated with the Subject. | STRING |
Subject.vital_status | Coded value indicating the state or condition of being living or deceased; also includes the case where the vital status is unknown. | STRING |
associated_project | A reference to the Project(s) of which this ResearchSubject is a member. The associated_project may be embedded using the $ref definition or may be a reference to the id for the Project - or a URI expressed as a string to an existing entity. | STRING |
byte_size | Size of the file in bytes. Maps to dcat:byteSize. | INTEGER |
checksum | A digit representing the sum of the correct digits in a piece of stored or transmitted digital data, against which later comparisons can be made to detect errors in the data. | STRING |
data_category | Broad categorization of the contents of the data file. | STRING |
data_modality | Data modality describes the biological nature of the information gathered as the result of an Activity, independent of the technology or methods used to produce the information. | STRING |
data_type | Specific content type of the data file. | STRING |
dbgap_accession_number | The dbgap accession number for the project. | STRING |
drs_uri | A string of characters used to identify a resource on the Data Repo Service(DRS). | STRING |
file_format | Format of the data files. | STRING |
id | The 'logical' identifier of the entity in the repository, e.g. a UUID. This 'id' is unique within a given system. The identified entity may have a different 'id' in a different system. | STRING |
identifier | A 'business' identifier or accession number for the entity, typically as provided by an external system or authority, that persists across implementing systems (i.e. a 'logical' identifier). | RECORD |
identifier.system | The system or namespace that defines the identifier. | STRING |
identifier.value | The value of the identifier, as defined by the system. | STRING |
imaging_modality | An imaging modality describes the imaging equipment and/or method used to acquire certain structural or functional information about the body. These include but are not limited to computed tomography (CT) and magnetic resonance imaging (MRI). Taken from the DICOM standard. | STRING |
imaging_series | The 'logical' identifier of the series or grouping of imaging files in the system of record which the file is a part of. | STRING |
label | Short name or abbreviation for dataset. Maps to rdfs:label. | STRING |
Specimen.derived_from_subject | The Patient/ResearchSubject, or Biologically Derived Materal (e.g. a cell line, tissue culture, organoid) from which the specimen was directly or indirectly derived. | STRING |
Last update:
2022-06-17