The Texas German Sample Corpus (TGSC) is a collection of annotated transcripts of spoken Texas German (~13.5 hours, 75,000+ tokens). The TGSC was created to implement and test the language-tagging and normalization guidelines proposed in Blevins (2022). Texas German is a set of mixed-language contact varieties of German "spoken in Texas which have descended from the dialects of German brought to Texas in the 19th century" by German-speaking immigrants (Boas 2009: 34). The TGSC is a collection of audio recordings from the Texas German Dialect Archive (TGDA, tgdp.org/dialect-archive) with the following annotation layers: original TGDA literary transcription, tokenization, language tags, normalization, standard German utterance translation, and the original TGDA word-for-word English translation.

Aug 10, 2022
Blevins, Margaret, 2022, "Texas German Sample Corpus", https://doi.org/10.18738/T8/IOX9ZA, Texas Data Repository, V1, UNF:6:Av6k1N9dIxbcqPT/kjGimA== [fileUNF]