{"success":true,"database":"eegdash","data":{"_id":"6953f4249276ef1ee07a33be","dataset_id":"ds005170","associated_paper_doi":null,"authors":["Zihan Zhang","Yi Zhao","Yu Bao","Xiao Ding"],"bids_version":"1.6.0","contact_info":["Zihan Zhang"],"contributing_labs":null,"data_processed":true,"dataset_doi":"doi:10.18112/openneuro.ds005170.v1.1.2","datatypes":["eeg"],"demographics":{"subjects_count":5,"ages":[26,30,22,28,22],"age_min":22,"age_max":30,"age_mean":25.6,"species":null,"sex_distribution":{"f":2,"m":3},"handedness_distribution":null},"experimental_modalities":null,"external_links":{"source_url":"https://openneuro.org/datasets/ds005170","osf_url":null,"github_url":null,"paper_url":null},"funding":["National Natural Science Foundation of China under Grants 62176079"],"ingestion_fingerprint":"e3c29e2e711fac39e5ddacb2808653865a40309143305296366aa6db8052acd1","license":"CC0","n_contributing_labs":null,"name":"Chisco","readme":"# Chisco Dataset\nChisco is a Chinese imagined speech dataset with five participants, identified as sub-01 to sub-05. The dataset includes raw data and preprocessed data in both fif and pkl formats. Information can also be found at https://github.com/zhangzihan-is-good/Chisco\n## Supplementary Information\nThe initial dataset release encompassed data from three participants (sub-01 to sub-03), as detailed in related Chisco publications. Subsequently, data from two additional subjects (sub-04 and sub-05) were incorporated. During the interval between the original dataset release and the addition of the new data, the BIDS protocol underwent updates. To preserve the integrity of the data processing code presented in our publications, the supplementary data continue to adhere to the previous version of the BIDS protocol. 
Consequently, the BIDS validator on our website may report errors; however, these do not compromise the usability of the dataset.\nFuture releases will include data from sub-06 and sub-07, who participated under a new experimental paradigm. These will be published as part of a new dataset, Chisco 2.0. We invite you to stay tuned for further updates.\n## Dataset Structure\n### Root Directory\n- `dataset_description.json`\n- `participants.tsv`\n- `README`\n- `derivatives/`\n- `sub-01/` to `sub-05/`\n- `textdataset/`\n- `json/`\n### Raw Data\nThe root directory contains folders `sub-01` to `sub-05` with raw data. Each participant's folder contains 5-6 session folders, corresponding to data collected over 5-6 days.\n### Preprocessed Data\nPreprocessed data is stored in the `derivatives` folder in both fif and pkl formats.\n### Text Data\nThe `textdataset` and `json` folders contain the text stimuli presented to the participants.\n### File Structure\n```\n/Chisco\n    /sub-01\n        /ses-01\n            /eeg\n                sub-01_ses-01_task-imagine_eeg.edf\n        ...\n    /sub-02\n        ...\n    /sub-03\n        ...\n    /sub-04\n        ...\n    /sub-05\n        ...\n    /derivatives\n        /fif\n            /sub-01\n                ...\n            /sub-02\n                ...\n            /sub-03\n                ...\n        /pkl\n            /sub-01\n                ...\n            /sub-02\n                ...\n            /sub-03\n                ...\n    /textdataset\n        ...\n    /json\n        ...\n    dataset_description.json\n    README\n    participants.tsv\n```\n## License\nThis dataset is licensed under the CC0 license. 
You are free to use the dataset for non-commercial purposes, provided that the original authors are properly credited.\n## Citation\nIf you use this dataset in your research, please cite the following repository:\nhttps://github.com/zhangzihan-is-good/Chisco\n## Contact Information\nFor any questions, please contact the dataset authors.\nThank you for using Chisco!","recording_modality":["eeg"],"senior_author":"Xiao Ding","sessions":["01","02","03","04","05","06"],"size_bytes":97416063152,"source":"openneuro","study_design":null,"study_domain":null,"tasks":["imagine"],"timestamps":{"digested_at":"2026-04-22T12:27:24.758171+00:00","dataset_created_at":"2024-05-22T15:04:36.742Z","dataset_modified_at":"2024-12-12T06:43:12.000Z"},"total_files":225,"storage":{"backend":"s3","base":"s3://openneuro.org/ds005170","raw_key":"dataset_description.json","dep_keys":["CHANGES","README","participants.tsv"]},"nemar_citation_count":1,"computed_title":"Chisco","nchans_counts":[],"sfreq_counts":[],"stats_computed_at":"2026-04-22T23:16:00.309058+00:00","tags":{"pathology":["Healthy"],"modality":["Visual"],"type":["Motor"],"confidence":{"pathology":0.65,"modality":0.75,"type":0.8},"reasoning":{"few_shot_analysis":"Most similar few-shot reference is **\"EEG Motor Movement/Imagery Dataset\"** (Healthy / Visual / Motor). Labeling convention: when the paradigm is *imagery of an action* (e.g., motor imagery), the catalog maps this to **Type=Motor**, and the cue is often visual (targets on screen) → **Modality=Visual**. Chisco is also an *imagery* dataset (imagined speech), which is closest in construct to motor imagery/BCI-style paradigms rather than perception, memory, or resting-state. 
This guides selecting **Type=Motor** and, given text prompts, **Modality=Visual**.\nA weaker secondary analogy is the \"Meta-rdk\" example (Visual / Perception): it shows that when the task is stimulus discrimination it becomes Perception, but Chisco is not discrimination; it is internally generated/imagined output, so the motor/imagery convention fits better.","metadata_analysis":"Key metadata facts:\n1) Task/construct: \"This dataset is a Chinese imagined speech dataset\" and files are named with \"task-imagine\" (\"sub-01_ses-01_task-imagine_eeg.edf\").\n2) Stimulus channel: \"The `textdataset` folder and `json` folder contain text data used to stimulate the participants.\" (text prompts imply visual presentation by default unless explicitly auditory).\n3) Population: only \"five participants\" with \"Age range: 22-30\" and no mention of any diagnosis or patient recruitment, suggesting a normative/healthy cohort.","paper_abstract_analysis":"No useful paper information.","evidence_alignment_check":"Pathology:\n- Metadata says: no disorder is mentioned; participants described only as \"five participants\" with \"Age range: 22-30\".\n- Few-shot pattern suggests: when no clinical recruitment is stated, label as Healthy.\n- Alignment: ALIGN (metadata is non-clinical; few-shot convention maps this to Healthy).\n\nModality:\n- Metadata says: \"text data used to stimulate the participants\" (text prompts).\n- Few-shot pattern suggests: imagery/BCI tasks commonly use on-screen cues/targets (e.g., Motor Movement/Imagery dataset uses visual targets) → Visual modality.\n- Alignment: ALIGN (text cues are consistent with Visual prompts).\n\nType:\n- Metadata says: \"imagined speech dataset\" and \"task-imagine\".\n- Few-shot pattern suggests: imagery-of-action paradigms are labeled Motor (e.g., motor imagery dataset).\n- Alignment: ALIGN (imagined speech is an imagery/production construct, closest to Motor among allowed labels).","decision_summary":"Top-2 candidates and 
selection:\n\nPathology:\n1) Healthy (winner)\n- Evidence: no diagnosis/patient group stated; \"five participants\"; \"Age range: 22-30\".\n2) Unknown (runner-up)\n- Rationale: metadata does not explicitly say \"healthy\".\nDecision: Healthy, because the dataset describes a generic participant sample with no clinical recruitment facts.\nConfidence basis: one strong absence-of-pathology fact pattern + participant demographics only.\n\nModality:\n1) Visual (winner)\n- Evidence: \"text data used to stimulate the participants\" (text prompts are typically visual).\n- Supporting detail: task label \"task-imagine\" suggests cue-based trials likely presented visually.\n2) Other (runner-up)\n- Rationale: imagined speech could in principle be cued in non-visual ways, but none are stated.\nDecision: Visual.\nConfidence basis: explicit mention of text stimuli.\n\nType:\n1) Motor (winner)\n- Evidence: \"imagined speech dataset\" and \"task-imagine\" → imagery/production paradigm; closest allowed Type is Motor.\n- Few-shot guidance: motor imagery dataset maps imagery paradigms to Type=Motor.\n2) Other (runner-up)\n- Rationale: speech imagery is not limb movement, but still an action/production imagery construct.\nDecision: Motor.\nConfidence basis: explicit imagery description + strong few-shot analog (imagery→Motor)."}},"total_duration_s":null,"tagger_meta":{"config_hash":"3557b68bca409f28","metadata_hash":"71c2e0f1628c06eb","model":"openai/gpt-5.2","tagged_at":"2026-04-07T09:32:40.872789+00:00"},"canonical_name":null,"name_confidence":0.83,"name_meta":{"suggested_at":"2026-04-14T10:18:35.343Z","model":"openai/gpt-5.2 + openai/gpt-5.4-mini + deterministic_fallback"},"name_source":"canonical","author_year":"Zhang2024_Chisco","size_human":"90.7 GB"}}