start
Differences
This shows you the differences between two versions of the page.
start [2020/04/17 20:14] – [This documentation] simone | start [2022/09/12 19:19] (current) – Stefan Bircher | ||
---|---|---|---|
Line 2: | Line 2: | ||
===== The project ===== | ===== The project ===== | ||
- | The linguistic | + | The data underlying the corpus was collected in 2014 to constitute the data base of the research project " |
+ | ===== Using the corpus ===== | ||
+ | [[https:// | ||
===== The corpus ===== | ===== The corpus ===== | ||
Line 10: | Line 12: | ||
* Number of chats: 617 | * Number of chats: 617 | ||
* Number of messages (with permission to be used): 763’644 | * Number of messages (with permission to be used): 763’644 | ||
+ | * Number of informants (who gave their permission): | ||
* Number of tokens: 5' | * Number of tokens: 5' | ||
* Number of emojis: 382' | * Number of emojis: 382' | ||
Line 18: | Line 21: | ||
* fra: French | * fra: French | ||
* ita: Italian | * ita: Italian | ||
- | * roh: Any variety of Romansh | + | * roh: any variety of Romansh |
* gsw: dialectal German as used in Switzerland | * gsw: dialectal German as used in Switzerland | ||
* deu: non-dialectal German | * deu: non-dialectal German | ||
* eng: English | * eng: English | ||
* spa: Spanish | * spa: Spanish | ||
- | * sla: Any Slavic language | + | * sla: any Slavic language |
Romansh varieties: | Romansh varieties: | ||
* roh-ja: Jauer Romansh | * roh-ja: Jauer Romansh | ||
- | * roh-sr: | + | * roh-sr: |
- | * roh-st: | + | * roh-st: |
- | * roh-sm: | + | * roh-sm: |
- | * roh-pt: | + | * roh-pt: |
- | * roh-vl: | + | * roh-vl: |
- | * roh-gr: | + | * roh-gr: |
Line 41: | Line 44: | ||
[[https:// | [[https:// | ||
- | ===== Using the corpus ===== | + | |
- | This corpus is freely available for academic, non-commercial research. When using the corpus, please make sure to cite it correctly. | + | |
Line 55: | Line 57: | ||
==== Creation of the corpus ==== | ==== Creation of the corpus ==== | ||
- | Ueberwasser, | + | Ueberwasser, |
==== The project ==== | ==== The project ==== | ||
Line 61: | Line 63: | ||
===== Raw data ===== | ===== Raw data ===== | ||
- | If you want to use our raw data for computational linguistic projects, please contact [[estark@rom.uzh.ch|Prof. Elisabeth Stark]] to see whether your project complies with our requirements. | + | If you want to use our raw data for computational linguistic projects, please contact [[estark@rom.uzh.ch|Prof. Elisabeth Stark]] to see whether your project complies with our requirements. Any data we make available is released under a CC BY-NC-ND license. |
start.1587147242.txt.gz · Last modified: 2022/06/27 09:21 (external edit)