{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "3c341e94",
   "metadata": {},
   "source": [
    "# Command Quick Reference\n",
    "---"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "a8387f59",
   "metadata": {},
   "source": [
    "## Loading cdapython package\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "id": "d28fac08",
   "metadata": {},
   "outputs": [],
   "source": [
    "from cdapython import Q, columns, unique_terms, query\n"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "185f91ff",
   "metadata": {},
   "source": [
    "cdapython comes with a few basic packages that are useful for exploring data:\n",
    "- pandas\n",
    "- itables\n",
    "- numpy"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "c1e70d8e",
   "metadata": {},
   "source": [
    "## cdapython version"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "id": "28a10e23",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\"><span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">2022.10</span>.<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">24</span>\n",
       "</pre>\n"
      ],
      "text/plain": [
       "\u001b[1;36m2022.10\u001b[0m.\u001b[1;36m24\u001b[0m\n"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "import cdapython\n",
    "print(Q.get_version())"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "0810c61c",
   "metadata": {},
   "source": [
    "## columns()\n",
    "\n",
    "Show all searchable columns in the CDA\n",
    "\n",
    "\n",
    "**Usage:**\n",
    "`columns(<optional args>)`\n",
    "\n",
    "**Options:**\n",
    "- `limit = <number>`: Changes the number of columns returned. Default = 100\n",
    "- `description = <True/False>`: Adds a description field for each column name. Default = True\n",
    "\n",
    "**Attributes:**  \n",
    "- `.to_dataframe()`: Returns results as a dataframe\n",
    "- `.to_list()`: Returns results as a list\n",
    "    \n",
    "**Returns:** object containing column names\n"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "6e22c697",
   "metadata": {},
   "source": [
    "## unique_terms()\n",
    "\n",
    "For a given column show all unique terms.\n",
    "\n",
    "**Usage:** `unique_terms(<column_name>, <optional args>)`\n",
    "\n",
    "**Options:**\n",
    "- `limit = <number>`: Changes the number of terms returned. Default = 100\n",
    "- `system= <data source>`: Returns only terms that belong to the specified datasource. Currently `GDC`, `IDC`, or `PDC`\n",
    "- `show_counts = <True/False>`: Returns terms as a dataframe that counts how often each term appears. Default = False\n",
    "\n",
    "**Attributes:**  \n",
    "- `.to_dataframe()`: Returns results as a dataframe.\n",
    "- `.to_list()`: Returns results as a list.\n",
    "    \n",
    "**Returns:** searchable terms for a given column"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "398578a6",
   "metadata": {},
   "source": [
    "## to_list()\n",
    "\n",
    "**Usage:** `columns().to_list(<optional args>)`\n",
    "\n",
    "**Options:**\n",
    "- `filters = <string>`: Only returns data that contain `<string>`\n",
    "- `exact = <True/False>`: Alters the behavior of `filters` to only return data that are an exact match to `<string>`\n",
    "\n",
    "**Returns:** results as a list "
   ]
  },
  {
   "cell_type": "markdown",
   "id": "766dd901",
   "metadata": {},
   "source": [
    "## to_dataframe()\n",
    "\n",
    "**Usage:** `columns().to_dataframe(<optional args>)`\n",
    "\n",
    "**Options:**\n",
    "- `search_fields = <array>`: Which column(s) should be searched for a given value, e.g. `[\"description\", fieldName\"] \n",
    "- `search_value = <string>`: The value to search for, e.g. `search_value = \"tumor\"`\n",
    "    \n",
    "**Returns:** results as a dataframe()"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "b13db3f1",
   "metadata": {},
   "source": [
    "## Q()\n",
    "\n",
    "Executes this query on the public CDA server.\n",
    "\n",
    "**Usage:** `<result> = Q(<column> <operator> <term>)`\n",
    "\n",
    "**Where:**\n",
    "- `<result>` is any user created variable\n",
    "- `<column>` is a value from `columns()`\n",
    "- `<operator>` is a valid comparison operator\n",
    "- `<term>` is a numeric/boolean/unique value from `unique_terms()`\n",
    "\n",
    "**Returns:** cda-python Q data type\n",
    "    \n",
    "    "
   ]
  },
  {
   "cell_type": "markdown",
   "id": "537984c5",
   "metadata": {},
   "source": [
    "### run()\n",
    "\n",
    "Retrieves results of a `Q()` query.\n",
    "\n",
    "**Usage:** `<result>.run(<optional args>)`\n",
    "\n",
    "**Where:**\n",
    "- `<result>` is a user created variable containing an executed `Q()` query\n",
    "\n",
    "- `<result>` is any user created variable\n",
    "- `<column>` is a value from `columns()`\n",
    "- `<operator>` is a valid comparison operator\n",
    "- `<term>` is a numeric/boolean/unique value from `unique_terms()`\n",
    "\n",
    "**Options:**\n",
    "\n",
    "- `filter = '<column>, <column>' `: Only returns columns that contain `<column>`(s). Can be used to return fewer columns from an endpoint, or to add columns that would not normally be returned to an endpoint result. In either case all desired columns must be listed.[1] Please note that returning many extra columns may result in improper joins.\n",
    "\n",
    "- `filter = '<column>:<newname1>, <column>:<newname2>' `: Only returns columns that contain `<column>`(s) and renames column(s) to `<newname>` in output\n",
    "- `limit = <number>`: Changes the number of rows returned. Default = 100\n",
    "\n",
    "**Returns:** cda-python Q data type\n",
    "\n",
    "[1] To add large numbers of columns to a result, it may be easier to get data from two endpoints and merge them, see an example in the [Cohort Building workflow](../../Examples/BuildingACohort/#merging-results-across-endpoints)\n"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "a4b7854c",
   "metadata": {},
   "source": [
    "### join_as_str()\n",
    "\n",
    "Returns a single column of data as a string\n",
    "\n",
    "**Usage:** `mycolumn = <result>.subject.run(filter='<column name>').join_as_str(key=\"<column name>\",delimiter=\",\")`\n",
    "\n",
    "**Where:**\n",
    "- `<result>` is a user created variable containing an executed `Q()` query\n",
    "- `<column name>` is the column of desired information\n",
    "\n",
    "The resulting string variable can be directly used as input to a new `Q()` query. This example would retrieve all the files for the subjects that met the conditions of the query above:\n",
    "\n",
    "`Q(f'<column name> IN ({mycolumn})').file.run()`\n"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "f4218838",
   "metadata": {},
   "source": [
    "### count()\n",
    "\n",
    "Retrieves summary counts of a `Q()` query.\n",
    "\n",
    "**Usage:** `<result>.count()`\n",
    "\n",
    "**Where:**\n",
    "- `<result>` is a user created variable containing an executed `Q()` query"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "40c5946d",
   "metadata": {},
   "source": [
    "### Valid `Q()` comparison operators \n",
    "\n",
    "`Q` uses the following operators:,\n",
    "\n",
    "- `=` : Equals\n",
    "- `!=` : Not Equal\n",
    "- `OR`\n",
    "- `AND`\n",
    "- `FROM`\n",
    "- `IN` and `NOT IN`\\\n",
    "- `%`: pattern matching a wildcard\n",
    "- `IS` and `IS NOT`\n",
    "- `>`, `>`, `>=`, `<=`: Greater and Less than\n",
    "- `ORDER_BY(<column>:<1 or -1>`: sort by a column\n",
    "\n",
    "Check out the [Operators guide](../../Examples/Operators/#Equals:-=) for more info"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "295707ae",
   "metadata": {},
   "source": [
    "### Print\n",
    "\n",
    "Displays a pretty print version of any results variable.\n",
    "\n",
    "**Usage:** `print(<result>)`\n",
    "\n",
    "**Where:**\n",
    "- `<result>` is a user created variable containing an executed `Q()` query\n"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "3b7fbca3",
   "metadata": {},
   "source": [
    "### Pagination\n",
    "\n",
    "By default, `Q` only returns the first 100 rows of data for each search. This keeps search fast while still allowing you to preview the data. Once you have a query that returns the data you want, you'll use the `paginator` function to return all the data as a dataframe or list.\n",
    "\n",
    "#### Auto-pagination to a dataframe\n",
    "\n",
    "**Usage:**\n",
    "\n",
    "*To write to a dataframe*\n",
    "`<result>.auto_paginator(to_df=True)`\n",
    "\n",
    "**Where:**\n",
    "- `<result>` is a user created variable containing an executed `Q()` query\n",
    "\n",
    "\n",
    "#### Looping pagination\n",
    "\n",
    "**Usage:**\n",
    "\n",
    "*To write to a dataframe*\n",
    "\n",
    "``` python\n",
    "\n",
    "import pandas as pd\n",
    "myquery = Q(<your search parameters>) \n",
    "    \n",
    "mydataframe = pd.DataFrame() #create an emptydataframe\n",
    "for i in myquery.subject.run().paginator(to_df=True): #loops through entire result, appending each page to your dataframe\n",
    "    mydataframe = pd.concat([mydataframe, i])\n",
    "```\n",
    "\n",
    "*To write to a list*\n",
    "\n",
    "``` python\n",
    "import pandas as pd\n",
    "myquery = Q(<your search parameters>) \n",
    "\n",
    "mylist = []\n",
    "for i in myquery.subject.run().paginator(to_list=True):\n",
    "    mylist.extend(i)\n",
    "```"
   ]
  }
 ],
 "metadata": {
  "interpreter": {
   "hash": "5c867d1980d5b66d2bfc8a5903dcee074b4b68f3917a4f27f8a310cab24e9f1b"
  },
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.10.6"
  },
  "metadata": {
   "interpreter": {
    "hash": "5c867d1980d5b66d2bfc8a5903dcee074b4b68f3917a4f27f8a310cab24e9f1b"
   }
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}
