subject fields

The subject endpoint uses a subject-first schema: the subject's own attributes sit at the top level, and all other metadata is nested under each subject.

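A . in a column name denotes nesting: the term after the . is a field nested inside the term before it. For example, ResearchSubject.Diagnosis.stage is the stage field of a Diagnosis record inside a ResearchSubject record. The nesting structure can be browsed more easily in the subject JSON schema.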
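As an illustration of the dotted naming, the sketch below walks a nested record along a dotted column name and collects every value found there. The helper name get_nested and the sample record are hypothetical, and the sketch assumes that RECORD fields may arrive either as a single object or as a list of objects; check the subject JSON schema for the actual shapes.

```python
from typing import Any, List


def get_nested(record: Any, dotted_name: str) -> List[Any]:
    """Collect all values at a dotted column name, e.g. 'ResearchSubject.Diagnosis.stage'.

    Hypothetical helper: lists met along the path are expanded, so one
    subject with several diagnoses yields several values.
    """
    current = [record]
    for key in dotted_name.split("."):
        next_level = []
        for item in current:
            if isinstance(item, dict) and key in item:
                value = item[key]
                if isinstance(value, list):
                    next_level.extend(value)  # repeated field: expand each entry
                else:
                    next_level.append(value)  # single value or single record
        current = next_level
    return current


# Made-up record for illustration only; not real data from the endpoint.
sample_subject = {
    "id": "subject-1",
    "ResearchSubject": [
        {
            "id": "rs-1",
            "Diagnosis": [
                {"id": "dx-1", "stage": "Stage I"},
                {"id": "dx-2", "stage": "Stage II"},
            ],
        },
        {"id": "rs-2", "Diagnosis": []},
    ],
}

print(get_nested(sample_subject, "ResearchSubject.Diagnosis.stage"))
# prints ['Stage I', 'Stage II']
```

A path that does not exist in the record simply returns an empty list, which keeps the helper safe to use on subjects that lack a given nested entity.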

column_name description data_type
Files List of ids of File entities associated with the Patient STRING
ResearchSubject A research subject is the entity of interest in a specific research study or project, typically a human being or an animal, but can also be a device, group of humans or animals, or a tissue sample. Human research subjects are usually not traceable to a particular person to protect the subject’s privacy. This entity plays the role of the case_id in existing data. RECORD
ResearchSubject.Diagnosis A collection of characteristics that describe an abnormal condition of the body as assessed at a point in time. May be used to capture information about neoplastic and non-neoplastic conditions. RECORD
ResearchSubject.Diagnosis.Treatment Represent medication administration or other treatment types. RECORD
ResearchSubject.Diagnosis.Treatment.days_to_treatment_end The timepoint at which the treatment ended. INTEGER
ResearchSubject.Diagnosis.Treatment.days_to_treatment_start The timepoint at which the treatment started. INTEGER
ResearchSubject.Diagnosis.Treatment.id The 'logical' identifier of the entity in the repository, e.g. a UUID. This 'id' is unique within a given system. The identified entity may have a different 'id' in a different system. STRING
ResearchSubject.Diagnosis.Treatment.identifier A 'business' identifier or accession number for the entity, typically as provided by an external system or authority, that persists across implementing systems (i.e. a 'logical' identifier). RECORD
ResearchSubject.Diagnosis.Treatment.identifier.system The system or namespace that defines the identifier. STRING
ResearchSubject.Diagnosis.Treatment.identifier.value The value of the identifier, as defined by the system. STRING
ResearchSubject.Diagnosis.Treatment.number_of_cycles The number of treatment cycles the subject received. INTEGER
ResearchSubject.Diagnosis.Treatment.therapeutic_agent One or more therapeutic agents as part of this treatment. STRING
ResearchSubject.Diagnosis.Treatment.treatment_anatomic_site The anatomical site that the treatment targets. STRING
ResearchSubject.Diagnosis.Treatment.treatment_effect The effect of a treatment on the diagnosis or tumor. STRING
ResearchSubject.Diagnosis.Treatment.treatment_end_reason The reason the treatment ended. STRING
ResearchSubject.Diagnosis.Treatment.treatment_outcome The final outcome of the treatment. STRING
ResearchSubject.Diagnosis.Treatment.treatment_type The treatment type including medication/therapeutics or other procedures. STRING
ResearchSubject.Diagnosis.age_at_diagnosis The age in days of the individual at the time of diagnosis. INTEGER
ResearchSubject.Diagnosis.grade The degree of abnormality of cancer cells, a measure of differentiation, the extent to which cancer cells are similar in appearance and function to healthy cells of the same tissue type. The degree of differentiation often relates to the clinical behavior of the particular tumor. Based on the microscopic findings, tumor grade is commonly described by one of four degrees of severity. Histopathologic grade of a tumor may be used to plan treatment and estimate the future course, outcome, and overall prognosis of disease. Certain types of cancers, such as soft tissue sarcoma, primary brain tumors, lymphomas, and breast have special grading systems. STRING
ResearchSubject.Diagnosis.id The 'logical' identifier of the entity in the repository, e.g. a UUID. This 'id' is unique within a given system. The identified entity may have a different 'id' in a different system. STRING
ResearchSubject.Diagnosis.identifier A 'business' identifier or accession number for the entity, typically as provided by an external system or authority, that persists across implementing systems (i.e. a 'logical' identifier). RECORD
ResearchSubject.Diagnosis.identifier.system The system or namespace that defines the identifier. STRING
ResearchSubject.Diagnosis.identifier.value The value of the identifier, as defined by the system. STRING
ResearchSubject.Diagnosis.method_of_diagnosis The method used to confirm the subject's malignant diagnosis. STRING
ResearchSubject.Diagnosis.morphology Code that represents the histology of the disease using the third edition of the International Classification of Diseases for Oncology, published in 2000, used principally in tumor and cancer registries for coding the site (topography) and the histology (morphology) of neoplasms. STRING
ResearchSubject.Diagnosis.primary_diagnosis The diagnosis instance that qualified a subject for inclusion on a ResearchProject. STRING
ResearchSubject.Diagnosis.stage The extent of a cancer in the body. Staging is usually based on the size of the tumor, whether lymph nodes contain cancer, and whether the cancer has spread from the original site to other parts of the body. STRING
ResearchSubject.Files List of ids of File entities associated with the ResearchSubject STRING
ResearchSubject.Specimen Any material taken as a sample from a biological entity (living or dead), or from a physical object or the environment. Specimens are usually collected as an example of their kind, often for use in some investigation. RECORD
ResearchSubject.Specimen.Files List of ids of File entities associated with the Specimen STRING
ResearchSubject.Specimen.anatomical_site Per GDC Dictionary, the text term that represents the name of the primary disease site of the submitted tumor sample; recommend dropping tumor; biospecimen_anatomic_site. STRING
ResearchSubject.Specimen.associated_project The Project associated with the specimen. STRING
ResearchSubject.Specimen.days_to_collection The number of days from the index date to either the date a sample was collected for a specific study or project, or the date a subject underwent a procedure (e.g. surgical resection) yielding a sample that was eventually used for research. INTEGER
ResearchSubject.Specimen.derived_from_specimen A source/parent specimen from which this one was directly derived. STRING
ResearchSubject.Specimen.derived_from_subject The Patient/ResearchSubject, or Biologically Derived Material (e.g. a cell line, tissue culture, organoid) from which the specimen was directly or indirectly derived. STRING
ResearchSubject.Specimen.id The 'logical' identifier of the entity in the system of record, e.g. a UUID. This 'id' is unique within a given system. The identified entity may have a different 'id' in a different system. STRING
ResearchSubject.Specimen.identifier A 'business' identifier or accession number for the entity, typically as provided by an external system or authority, that persists across implementing systems (i.e. a 'logical' identifier). RECORD
ResearchSubject.Specimen.identifier.system The system or namespace that defines the identifier. STRING
ResearchSubject.Specimen.identifier.value The value of the identifier, as defined by the system. STRING
ResearchSubject.Specimen.primary_disease_type The text term used to describe the type of malignant disease, as categorized by the World Health Organization's (WHO) International Classification of Diseases for Oncology (ICD-O). This attribute represents the disease that qualified the subject for inclusion on the ResearchProject. STRING
ResearchSubject.Specimen.source_material_type The general kind of material from which the specimen was derived, indicating the physical nature of the source material. STRING
ResearchSubject.Specimen.specimen_type The high-level type of the specimen, based on how it has been derived from the original extracted sample. STRING
ResearchSubject.id The 'logical' identifier of the entity in the system of record, e.g. a UUID. This 'id' is unique within a given system. The identified entity may have a different 'id' in a different system. For CDA, this is case_id. STRING
ResearchSubject.identifier A 'business' identifier for the entity, typically as provided by an external system or authority, that persists across implementing systems (i.e. a 'logical' identifier). Uses a specialized, complex 'Identifier' data type to capture information about the source of the business identifier - or a URI expressed as a string to an existing entity. RECORD
ResearchSubject.identifier.system The system or namespace that defines the identifier. STRING
ResearchSubject.identifier.value The value of the identifier, as defined by the system. STRING
ResearchSubject.member_of_research_project A reference to the Study(s) of which this ResearchSubject is a member. STRING
ResearchSubject.primary_diagnosis_condition The text term used to describe the type of malignant disease, as categorized by the World Health Organization's (WHO) International Classification of Diseases for Oncology (ICD-O). This attribute represents the disease that qualified the subject for inclusion on the ResearchProject. STRING
ResearchSubject.primary_diagnosis_site The text term used to describe the primary site of disease, as categorized by the World Health Organization's (WHO) International Classification of Diseases for Oncology (ICD-O). This categorization groups cases into general categories. This attribute represents the primary site of disease that qualified the subject for inclusion on the ResearchProject. STRING
cause_of_death Coded value indicating the circumstance or condition that results in the death of the subject. STRING
days_to_birth Number of days between the index date and the person's date of birth, represented as a calculated negative number of days. INTEGER
days_to_death Number of days between the index date and the person's date of death, represented as a calculated number of days. INTEGER
ethnicity An individual's self-described social and cultural grouping, specifically whether an individual describes themselves as Hispanic or Latino. The provided values are based on the categories defined by the U.S. Office of Management and Budget and used by the U.S. Census Bureau. STRING
id The 'logical' identifier of the entity in the system of record, e.g. a UUID. This 'id' is unique within a given system. The identified entity may have a different 'id' in a different system. STRING
identifier A 'business' identifier for the entity, typically as provided by an external system or authority, that persists across implementing systems (i.e. a 'logical' identifier). Uses a specialized, complex 'Identifier' data type to capture information about the source of the business identifier - or a URI expressed as a string to an existing entity. RECORD
identifier.system The system or namespace that defines the identifier. STRING
identifier.value The value of the identifier, as defined by the system. STRING
race An arbitrary classification of a taxonomic group that is a division of a species. It usually arises as a consequence of geographical isolation within a species and is characterized by shared heredity, physical attributes and behavior, and in the case of humans, by common history, nationality, or geographic distribution. The provided values are based on the categories defined by the U.S. Office of Management and Budget and used by the U.S. Census Bureau. STRING
sex The biologic character or quality that distinguishes male and female from one another as expressed by analysis of the person's gonadal, morphologic (internal and external), chromosomal, and hormonal characteristics. STRING
species The taxonomic group (e.g. species) of the patient. For MVP, since taxonomy vocabulary is consistent between GDC and PDC, using text. Ultimately, this will be a term returned by the vocabulary service. STRING
subject_associated_project The list of Projects associated with the Subject. STRING
vital_status Coded value indicating the state or condition of being living or deceased; also includes the case where the vital status is unknown. STRING
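Several fields above (days_to_birth, days_to_death, ResearchSubject.Diagnosis.age_at_diagnosis) are expressed in days rather than as dates or years. A minimal sketch of converting them to approximate years follows; the function names are hypothetical, and the 365.25 days-per-year divisor is an assumption chosen for illustration, not something this page specifies.

```python
from typing import Optional

# Assumed average year length; not defined by the schema.
DAYS_PER_YEAR = 365.25


def age_at_index_years(days_to_birth: Optional[int]) -> Optional[float]:
    """Approximate age at the index date.

    days_to_birth is described as a negative number of days, so it is
    negated before converting. Missing values are passed through as None.
    """
    if days_to_birth is None:
        return None
    return -days_to_birth / DAYS_PER_YEAR


def days_to_years(days: Optional[int]) -> Optional[float]:
    """Convert a non-negative day count (e.g. age_at_diagnosis) to years."""
    if days is None:
        return None
    return days / DAYS_PER_YEAR


# Made-up values for illustration only.
print(age_at_index_years(-7305))  # prints 20.0
print(days_to_years(14610))       # prints 40.0
```

Passing missing values through as None, rather than substituting zero, avoids reporting a subject with no recorded birth date as a newborn.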

Last update: 2022-06-17