Utterance
Validate against: http://json-schema.org/draft-07/schema#
Schema ID: http://schemas.digitallinguistics.io/Utterance-6.0.0.json
Type: object
Description
The term utterance is intentionally ambiguous, and refers to any unit of a text above the word level. The DLx framework imposes no requirements regarding the size of this unit or how segmentation of the text into units should be accomplished. The user may choose to segment a text based on prosodic units, turns, sentences, or any other appropriate subdivision.
Required Properties
transcription
translation
Dependencies
This object has the following dependencies between properties:
- If this object has the startTime property, it must have the following properties as well:
  - endTime
- If this object has the endTime property, it must have the following properties as well:
  - startTime
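The required properties and the two dependencies above can be spot-checked without a full JSON Schema validator. The following is a minimal sketch, assuming the Utterance has already been parsed into a Python dict; the function name `structural_problems` is hypothetical and not part of the DLx tooling.

```python
# Property names are taken from the "Required Properties" and
# "Dependencies" sections; everything else here is illustrative.
REQUIRED = ("transcription", "translation")

# Each key requires every property listed in its value to be present too.
DEPENDENCIES = {"startTime": ("endTime",), "endTime": ("startTime",)}


def structural_problems(utterance: dict) -> list[str]:
    """Return human-readable problems; an empty list means both rules pass."""
    problems = [
        f"missing required property: {name}"
        for name in REQUIRED
        if name not in utterance
    ]
    for trigger, needed in DEPENDENCIES.items():
        if trigger in utterance:
            problems += [
                f"{trigger} requires {name}"
                for name in needed
                if name not in utterance
            ]
    return problems
```

The sketch only checks for the presence of properties. It does not check their types or the constraints of the referenced schemas.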
Properties
The following properties are defined for this object:
Type:
type
Type: string
Read-only: true
Description
The type of object. Must be set to Utterance.
This item must have the following value: "Utterance"
Key:
key
Type: string
Description
A key which uniquely identifies this Utterance within the Text. The key for an Utterance consists of the abbreviation of the Text, a period, dash, or underscore, and then the number of this Utterance within the Text (index starts at 1). For example, the third Utterance of a Text with the abbreviation A would be A.3. Keys should be unique within a corpus.
Regular expression to match: ^[(a-z)|(A-Z)|(0-9)]+[-_\.][0-9]{1,3}$
End Time:
endTime
Type: number
Description
The time that the speaker finishes producing this Utterance within the media file(s) associated with this Text. The timestamp should be formatted in SS.MMM (seconds and milliseconds).
Minimum: 0.001
Grammaticality & Acceptability judgments:
judgments
Type: array
Description
An array of grammaticality judgments or acceptability judgments for this utterance.
Items must be unique: true
Items
Each item in this array must adhere to the following schema:
Grammaticality / Acceptability judgment:
judgments
Type: object
Description
A judgment of the grammaticality or acceptability of the utterance. Some linguists distinguish between grammaticality and acceptability, such that some utterances may be considered grammatical but not acceptable. Unacceptable utterances are typically those which are semantically or pragmatically odd in context. It is strongly recommended that a note be included with each judgment, and the source of the judgment indicated in the note.
Required Properties
judgment
judgmentType
Properties
The following properties are defined for this object:
Grammaticality / Acceptability judgment:
judgment
Type: number
Description
The grammaticality / acceptability judgment for this utterance, represented as a value between 0 (completely ungrammatical / unacceptable) and 1 (completely grammatical / acceptable). Simple binary judgments (“good” vs. “bad”, “grammatical” vs. “ungrammatical”) can simply use the values 0 and 1. Scalar judgments should be normalized to a value between 0 and 1. For example, a scale of 1-3 asterisks for grammaticality could be represented as follows: 0.00 = ***, 0.33 = **, 0.66 = *, 1.00 = completely grammatical.
judgment Type:
judgmentType
Type: string
Description
Indicates whether the judgment is an acceptability judgment or a grammaticality judgment.
Allowed Values
"acceptability"
"grammaticality"
judgment Note:
note
Type: object
Description
A note about this judgment. It is strongly recommended that every judgment be accompanied by a note, indicating the speaker / source of the judgment, and if possible an explanation for unacceptable or ungrammatical judgments.
Referenced Schema
This item must validate against the following schema:
Additional Properties
Any additional properties must adhere to the following schema:
No values are valid for this schema.
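The asterisk example in the judgment description can be turned into a small conversion routine. This is a sketch under stated assumptions: the helper names `asterisks_to_judgment` and `make_judgment` are hypothetical, and the result is truncated to two decimal places only so that it reproduces the sample values given above (0.33 and 0.66). The schema itself does not prescribe any rounding.

```python
ALLOWED_TYPES = {"acceptability", "grammaticality"}


def asterisks_to_judgment(asterisks: int, max_asterisks: int = 3) -> float:
    """Map an asterisk count to the 0-1 scale (more asterisks = worse)."""
    if not 0 <= asterisks <= max_asterisks:
        raise ValueError("asterisk count is outside the scale")
    # Integer arithmetic truncates to two decimals, e.g. 2/3 -> 0.66.
    return ((max_asterisks - asterisks) * 100 // max_asterisks) / 100


def make_judgment(value: float, judgment_type: str) -> dict:
    """Build a judgment object containing the two required properties."""
    if not 0 <= value <= 1:
        raise ValueError("judgment must lie between 0 and 1")
    if judgment_type not in ALLOWED_TYPES:
        raise ValueError(f"judgmentType must be one of {sorted(ALLOWED_TYPES)}")
    return {"judgment": value, "judgmentType": judgment_type}
```

The sketch omits the recommended note property; in practice a note naming the source of the judgment would be added to the returned object.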
Language:
language
Type: string
Description
The key for the Language used in this Utterance, e.g. spa or eng. If the text is labeled with a Language, all its Utterances are assumed to be the same Language unless labeled otherwise. Likewise, if an Utterance is given a Language, all its words are assumed to be the same Language unless the word is labeled otherwise.
Referenced Schema
This item must validate against the following schema:
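The inheritance rule described for the language property (word, then Utterance, then Text) can be sketched as a simple lookup. This assumes that Text and Word objects carry their label in a property also named `language`, which this page does not confirm; the function name `effective_language` is likewise hypothetical.

```python
from typing import Optional


def effective_language(word: dict, utterance: dict, text: dict) -> Optional[str]:
    """Return the most specific Language key: word, then Utterance, then Text."""
    for obj in (word, utterance, text):
        language = obj.get("language")  # assumed property name on all three
        if language:
            return language
    return None  # nothing is labeled at any level
```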
Link:
link
Type: string
Description
A URL where a presentational format for this resource may be viewed
Format: uri
Literal Translation:
literal
Type: object
Description
The literal translations for this Utterance, optionally in multiple languages.
Referenced Schema
This item must validate against the following schema:
Phonetic:
phonetic
Type: string
Description
The phonetic transcription for this Utterance in IPA. Only valid IPA characters are allowed. The transcription should not include phonetic brackets.
Notes:
notes
Type: array
Description
A collection of notes about this Utterance
Items must be unique: true
Items
Each item in this array must adhere to the following schema:
Note:
notes
Type: object
Description
A note about this Utterance
Referenced Schema
This item must validate against the following schema:
Speaker:
speaker
Type: object
Description
The Person who produced (uttered, signed, spoke, sang) this Utterance. The value of this field must match one of the people listed in the contributors array of the Text. If the text has a single contributor with the role of speaker, that speaker is assumed to be the speaker for all Utterances in the Text. If multiple contributors with a speaker role are included in a text, each Utterance must have its speaker attribute specified.
Referenced Schema
This item must validate against the following schema:
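The defaulting rule for speaker can be sketched as follows. The shape of a contributor entry is an assumption made for illustration (a Person-like dict with an extra `role` key); the Text schema is not shown on this page, and `resolve_speaker` is a hypothetical helper.

```python
from typing import Optional


def resolve_speaker(utterance: dict, text: dict) -> Optional[dict]:
    """Return the explicit speaker, or fall back to the Text's sole speaker."""
    if "speaker" in utterance:
        return utterance["speaker"]
    # Assumed shape: each contributor is a dict with a "role" key.
    speakers = [
        c for c in text.get("contributors", []) if c.get("role") == "speaker"
    ]
    if len(speakers) == 1:
        return speakers[0]
    if len(speakers) > 1:
        raise ValueError("several speakers: each Utterance must name its speaker")
    return None  # the Text lists no contributor with the speaker role
```

The sketch does not verify that an explicit speaker matches an entry in the contributors array, since how two Person objects are compared is not specified here.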
Source:
source
Type: object
Description
A citation for the publication from which this utterance was taken. When the utterance is not part of a text, or when the text consists of random utterances taken from different places, this field is strongly recommended.
Referenced Schema
This item must validate against the following schema:
Start Time:
startTime
Type: number
Description
The time that the speaker begins producing this Utterance within the media file(s) associated with this Text. The timestamp should be formatted in SS.MMM (seconds and milliseconds).
Minimum: 0
Tags:
tags
Type: object
Description
A set of tags for this Utterance
Referenced Schema
This item must validate against the following schema:
Transcript:
transcript
Type: object
Description
A transcript of this Utterance, including things like prosodic markup, overlap, pauses, and various other discourse features. This field is intended for use by those doing discourse or conversation analysis, who need to mark up their text without affecting the phonemic transcription (in the transcription property). The transcript may be in multiple orthographies, or representational systems (e.g. you might have a CA transcript and a DT transcript, for discourse transcripts using Conversation Analysis and Discourse Transcription conventions respectively).
Referenced Schema
This item must validate against the following schema:
Minimum number of properties: 1
Transcription:
transcription
Type: object
Description
The transcriptions for this Utterance, optionally in multiple orthographies. This field is intended for use with purely phonemic / morphophonemic transcriptions. Punctuation should generally be avoided. To add punctuation and other discourse-level transcriptional features, use the transcript property. The transcription must be provided in at least one orthography.
Referenced Schema
This item must validate against the following schema:
Minimum number of properties: 1
Translation:
translation
Type: object
Description
The free translations for this Utterance, optionally in multiple languages. The translation must be provided in at least one language.
Referenced Schema
This item must validate against the following schema:
URL:
url
Type: string
Description
The URL where this Utterance can be retrieved in JSON format
Format: uri
Words:
words
Type: array
Description
A collection of the word tokens contained in this Utterance. Tokens do not need to be unique.
Items must be unique: false
Items
Each item in this array must adhere to the following schema:
Word:
words
Type: object
Description
A Word object
Referenced Schema
This item must validate against the following schema:
Additional Properties
Any additional properties must adhere to the following schema:
This schema imposes no restrictions. All values are valid.
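The key pattern and the minimums given for startTime and endTime among the properties above can also be checked directly. The sketch below copies the regular expression verbatim from the key property; the helper names are hypothetical. JSON Schema patterns follow the ECMA 262 dialect, so using Python's `re` module is an approximation that happens to behave the same way for this particular expression.

```python
import re

# Copied verbatim from the key property. As written, the first character
# class also admits the literal characters "(", ")" and "|".
KEY_PATTERN = re.compile(r"^[(a-z)|(A-Z)|(0-9)]+[-_\.][0-9]{1,3}$")


def is_valid_key(key: str) -> bool:
    """True if the key matches the pattern over its whole length."""
    # fullmatch avoids Python's "$" also matching before a trailing newline.
    return KEY_PATTERN.fullmatch(key) is not None


def times_in_range(utterance: dict) -> bool:
    """Check the documented minimums: startTime >= 0 and endTime >= 0.001."""
    start_ok = utterance.get("startTime", 0) >= 0
    end_ok = utterance.get("endTime", 0.001) >= 0.001
    return start_ok and end_ok
```

Note that the pattern allows at most three digits for the Utterance number, and that the schema as documented here does not require endTime to be later than startTime, so the sketch does not check that either.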
Examples
The following are example values for this schema:
-
{
  "judgments": [
    {
      "judgment": 0.66,
      "judgmentType": "acceptability",
      "note": {
        "source": { "abbreviation": "BP" },
        "text": "Speaker B found this utterance odd because the first two words were contracted."
      }
    }
  ],
  "literal": { "eng": "one day a man" },
  "phonetic": "waʃtˀunkˀu ʔasi",
  "speaker": { "familyName": "Paul", "givenName": "Benjamin" },
  "source": { "citationKey": "Swadesh1946" },
  "transcript": { "Mod": "Waxdungu qasi," },
  "transcription": { "Mod": "waxdungu qasi" },
  "translation": { "eng": "One day a man," },
  "words": []
}