Utterance
Validate against: http://json-schema.org/draft-07/schema#
Schema ID: http://schemas.digitallinguistics.io/Utterance-6.0.0.json
Type: object
Description
The term utterance is intentionally ambiguous, and refers to any unit of a text above the word level. The DLx framework imposes no requirements regarding the size of this unit or how the text should be segmented into units. The user may choose to segment a text based on prosodic units, turns, sentences, or any other appropriate subdivision.
Required Properties
transcription
translation
Dependencies
This object has the following dependencies between properties:
- If this object has the startTime property, it must also have the following properties: endTime
- If this object has the endTime property, it must also have the following properties: startTime
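Taken together, the two dependency rules mean that startTime and endTime appear together or not at all. A minimal Python sketch of that check follows; the function name check_time_dependencies is hypothetical and not part of any DLx tooling.

```python
def check_time_dependencies(utterance):
    """Return a list of dependency violations (empty if there are none)."""
    # Each key maps to the properties that must accompany it.
    dependencies = {
        "startTime": ["endTime"],
        "endTime": ["startTime"],
    }
    errors = []
    for prop, required in dependencies.items():
        if prop in utterance:
            for other in required:
                if other not in utterance:
                    errors.append(f"'{prop}' is present but '{other}' is missing")
    return errors
```

An Utterance with neither property passes, since both rules only apply when one of the two properties is present.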
Properties
The following properties are defined for this object:
Type:
type
Type:
string
Read-only:
true
Description
The type of object. Must be set to Utterance.
This item must have the following value:
"Utterance"
Key:
key
Type:
string
Description
A key which uniquely identifies this Utterance within the Text. The key for an Utterance consists of the abbreviation of the Text, a period, dash, or underscore, and then the number of this Utterance within the Text (index starts at 1). For example, the third Utterance of a Text with the abbreviation A would be A.3. Keys should be unique within a corpus.
Regular expression to match:
^[(a-z)|(A-Z)|(0-9)]+[-_\.][0-9]{1,3}$
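The key format can be tried out with a short Python sketch. The pattern is copied verbatim from the schema; the helper name is_valid_key is hypothetical. JSON Schema patterns use the ECMA 262 dialect, and the assumption here is that this simple pattern behaves the same way in Python's re module.

```python
import re

# Pattern copied verbatim from the schema. Inside the square brackets the
# characters "(", ")" and "|" are literal, so the pattern accepts them in the
# abbreviation in addition to letters and digits.
KEY_PATTERN = re.compile(r"^[(a-z)|(A-Z)|(0-9)]+[-_\.][0-9]{1,3}$")

def is_valid_key(key):
    """Return True if the whole string matches the schema's key pattern."""
    # fullmatch avoids Python's "$" also matching before a trailing newline.
    return KEY_PATTERN.fullmatch(key) is not None

for candidate in ("A.3", "Text1_12", "abc-999", "A.1000", "A 3"):
    print(candidate, is_valid_key(candidate))
```

Because the pattern allows one to three digits, a key such as A.1000 does not match.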
End Time:
endTime
Type:
number
Description
The time that the speaker finishes producing this Utterance within the media file(s) associated with this Text. The timestamp should be formatted in SS.MMM (seconds and milliseconds).
Minimum:
0.001
Grammaticality & Acceptability judgments:
judgments
Type:
array
Description
An array of grammaticality judgments or acceptability judgments for this utterance.
Items must be unique:
true
Items
Each item in this array must adhere to the following schema:
Grammaticality / Acceptability judgment:
judgments
Type:
object
Description
A judgment of the grammaticality or acceptability of the utterance. Some linguists distinguish between grammaticality and acceptability, such that some utterances may be considered grammatical but not acceptable. Unacceptable utterances are typically those which are semantically or pragmatically odd in context. It is strongly recommended that a note be included with each judgment, and the source of the judgment indicated in the note.
Required Properties
judgment
judgmentType
Properties
The following properties are defined for this object:
Grammaticality / Acceptability judgment:
judgment
Type:
number
Description
The grammaticality / acceptability judgment for this utterance, represented as a value between 0 (completely ungrammatical / unacceptable) and 1 (completely grammatical / acceptable). Simple binary judgments (“good” vs. “bad”, “grammatical” vs. “ungrammatical”) can simply use the values 0 and 1. Scalar judgments should be normalized to a value between 0 and 1. For example, a scale of 1-3 asterisks for grammaticality could be represented as follows: 0.00 = ***, 0.33 = **, 0.66 = *, 1.00 = completely grammatical.
judgment Type:
judgmentType
Type:
string
Description
Indicates whether the judgment is an acceptability judgment or a grammaticality judgment.
Allowed Values
"acceptability"
"grammaticality"
judgment Note:
note
Type:
object
Description
A note about this judgment. It is strongly recommended that every judgment be accompanied by a note, indicating the speaker / source of the judgment, and if possible an explanation for unacceptable or ungrammatical judgments.
Referenced Schema
This item must validate against the following schema:
Additional Properties
Any additional properties must adhere to the following schema:
No values are valid for this schema.
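The normalization described for the judgment property can be sketched in Python as follows. The function names asterisks_to_judgment and make_judgment are hypothetical, and the range check follows the property description rather than a constraint listed in this reference. The values 0.33 and 0.66 in the description correspond to 1/3 and 2/3 cut to two decimal places; the sketch returns the unrounded fraction.

```python
ALLOWED_TYPES = ("acceptability", "grammaticality")

def asterisks_to_judgment(asterisks, max_asterisks=3):
    """Map an asterisk count to the 0..1 scale (more asterisks = worse)."""
    if not 0 <= asterisks <= max_asterisks:
        raise ValueError("asterisk count is outside the scale")
    return (max_asterisks - asterisks) / max_asterisks

def make_judgment(value, judgment_type, note=None):
    """Build a judgment object with the two required properties."""
    if not 0 <= value <= 1:
        raise ValueError("judgment must lie between 0 and 1")
    if judgment_type not in ALLOWED_TYPES:
        raise ValueError("judgmentType must be one of " + ", ".join(ALLOWED_TYPES))
    judgment = {"judgment": value, "judgmentType": judgment_type}
    if note is not None:
        # The note is recommended but optional; it is passed through unchanged.
        judgment["note"] = note
    return judgment

# One asterisk on a three-asterisk scale maps to 2/3.
print(make_judgment(asterisks_to_judgment(1), "grammaticality"))
```

Since no additional properties are allowed on a judgment, the sketch only ever writes judgment, judgmentType, and note.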
Language:
language
Type:
string
Description
The key for the Language used in this Utterance, e.g. spa or eng. If the Text is labeled with a Language, all its Utterances are assumed to be in the same Language unless labeled otherwise. Likewise, if an Utterance is given a Language, all its words are assumed to be in the same Language unless the word is labeled otherwise.
Referenced Schema
This item must validate against the following schema:
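The inheritance rule described for language (word, then Utterance, then Text) can be sketched as a simple lookup. This is an illustration only: the function name effective_language is hypothetical, and it assumes that the Text and Word objects carry their Language key in a property also named language.

```python
def effective_language(text, utterance=None, word=None):
    """Return the most specific Language key available, or None."""
    # Check the most specific object first, then fall back to its containers.
    for item in (word, utterance, text):
        if item is not None and "language" in item:
            return item["language"]
    return None

text = {"language": "spa"}
utterance = {}                 # no label: inherits from the Text
word = {"language": "eng"}     # labeled otherwise: overrides the Text

print(effective_language(text, utterance))
print(effective_language(text, utterance, word))
```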
Link:
link
Type:
string
Description
A URL where a presentational format for this resource may be viewed
Format:
uri
Literal Translation:
literal
Type:
object
Description
The literal translations for this Utterance, optionally in multiple languages.
Referenced Schema
This item must validate against the following schema:
Phonetic:
phonetic
Type:
string
Description
The phonetic transcription for this Utterance in IPA. Only valid IPA characters are allowed. The transcription should not include phonetic brackets.
Notes:
notes
Type:
array
Description
A collection of notes about this Utterance
Items must be unique:
true
Items
Each item in this array must adhere to the following schema:
Note:
notes
Type:
object
Description
A note about this Utterance
Referenced Schema
This item must validate against the following schema:
Speaker:
speaker
Type:
object
Description
The Person who produced (uttered, signed, spoke, sang) this Utterance. The value of this field must match one of the people listed in the contributors array of the Text. If the Text has a single contributor with the role of speaker, that speaker is assumed to be the speaker for all Utterances in the Text. If multiple contributors with a speaker role are included in a Text, each Utterance must have its speaker attribute specified.
Referenced Schema
This item must validate against the following schema:
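The rule about multiple speakers is not something this schema can enforce by itself, because it depends on the enclosing Text. A rough Python sketch of such a check is given below. It rests on assumptions not confirmed by this page: that each contributor has a role property, and that the Text stores its Utterances in an array named utterances. The function name utterances_missing_speaker is hypothetical.

```python
def utterances_missing_speaker(text):
    """Return 1-based positions of Utterances that need a speaker but lack one."""
    speakers = [
        person
        for person in text.get("contributors", [])
        if person.get("role") == "speaker"  # assumed property name
    ]
    # With zero or one speaker, the speaker is implied for every Utterance.
    if len(speakers) <= 1:
        return []
    return [
        position
        for position, utterance in enumerate(text.get("utterances", []), start=1)
        if "speaker" not in utterance
    ]
```

The sketch only reports missing speaker properties; it does not verify that a given speaker matches one of the listed contributors.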
Source:
source
Type:
object
Description
A citation for the publication from which this utterance was taken. When the utterance is not part of a text, or when the text consists of unrelated utterances taken from different places, this field is strongly recommended.
Referenced Schema
This item must validate against the following schema:
Start Time:
startTime
Type:
number
Description
The time that the speaker begins producing this Utterance within the media file(s) associated with this Text. The timestamp should be formatted in SS.MMM (seconds and milliseconds).
Minimum:
0
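Since startTime and endTime are plain numbers of seconds with millisecond precision, timestamps recorded as hours, minutes, and seconds need converting first. A minimal sketch, with hypothetical function names to_seconds and check_time_minimums; the minimums used are the ones listed for the two properties (0 for startTime, 0.001 for endTime).

```python
def to_seconds(hours=0, minutes=0, seconds=0.0):
    """Convert a clock-style timestamp to seconds, rounded to milliseconds."""
    return round(hours * 3600 + minutes * 60 + seconds, 3)

def check_time_minimums(start_time, end_time):
    """Return a list of violations of the listed minimum values."""
    errors = []
    if start_time < 0:
        errors.append("startTime must be at least 0")
    if end_time < 0.001:
        errors.append("endTime must be at least 0.001")
    return errors

start = to_seconds(minutes=1, seconds=2.5)
end = to_seconds(minutes=1, seconds=4.25)
print(start, end, check_time_minimums(start, end))
```

The schema as documented here does not state that endTime must be later than startTime, so the sketch does not check that.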
Tags:
tags
Type:
object
Description
A set of tags for this Utterance
Referenced Schema
This item must validate against the following schema:
Transcript:
transcript
Type:
object
Description
A transcript of this Utterance, including things like prosodic markup, overlap, pauses, and various other discourse features. This field is intended for use by those doing discourse or conversation analysis, who need to mark up their text without affecting the phonemic transcription (in the transcription property). The transcript may be in multiple orthographies, or representational systems (e.g. you might have a CA transcript and a DT transcript, for discourse transcripts using Conversation Analysis and Discourse Transcription conventions respectively).
Referenced Schema
This item must validate against the following schema:
Minimum number of properties:
1
Transcription:
transcription
Type:
object
Description
The transcriptions for this Utterance, optionally in multiple orthographies. This field is intended for use with purely phonemic / morphophonemic transcriptions. Punctuation should generally be avoided. To add punctuation and other discourse-level transcriptional features, use the transcript property. The transcription must be provided in at least one orthography.
Referenced Schema
This item must validate against the following schema:
Minimum number of properties:
1
Translation:
translation
Type:
object
Description
The free translations for this Utterance, optionally in multiple languages. The translation must be provided in at least one language.
Referenced Schema
This item must validate against the following schema:
URL:
url
Type:
string
Description
The URL where this Utterance can be retrieved in JSON format
Format:
uri
Words:
words
Type:
array
Description
A collection of the word tokens contained in this Utterance. Tokens do not need to be unique.
Items must be unique:
false
Items
Each item in this array must adhere to the following schema:
Word:
words
Type:
object
Description
A Word object
Referenced Schema
This item must validate against the following schema:
Additional Properties
Any additional properties must adhere to the following schema:
This schema imposes no restrictions. All values are valid.
Examples
The following are example values for this schema:
-
{
  "judgments": [
    {
      "judgment": 0.66,
      "judgmentType": "acceptability",
      "note": {
        "source": { "abbreviation": "BP" },
        "text": "Speaker B found this utterance odd because the first two words were contracted."
      }
    }
  ],
  "literal": { "eng": "one day a man" },
  "phonetic": "waʃtˀunkˀu ʔasi",
  "speaker": { "familyName": "Paul", "givenName": "Benjamin" },
  "source": { "citationKey": "Swadesh1946" },
  "transcript": { "Mod": "Waxdungu qasi," },
  "transcription": { "Mod": "waxdungu qasi" },
  "translation": { "eng": "One day a man," },
  "words": []
}
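As a rough illustration of the top-level constraints, the following Python sketch checks the required properties and the non-empty objects on an abridged copy of the example. It uses only the standard library and is not a substitute for validating against the full schema. The function name check_utterance is hypothetical, and treating translation as non-empty follows that property's description rather than a listed minimum.

```python
import json

REQUIRED = ("transcription", "translation")
# Objects that must contain at least one property when present.
NON_EMPTY = ("transcript", "transcription", "translation")

def check_utterance(utterance):
    """Return a list of problems found (empty if there are none)."""
    errors = []
    for prop in REQUIRED:
        if prop not in utterance:
            errors.append(f"missing required property '{prop}'")
    for prop in NON_EMPTY:
        if prop in utterance:
            value = utterance[prop]
            if not isinstance(value, dict) or len(value) < 1:
                errors.append(f"'{prop}' must be an object with at least one property")
    return errors

# Abridged copy of the example value.
example = json.loads("""
{
  "transcript": { "Mod": "Waxdungu qasi," },
  "transcription": { "Mod": "waxdungu qasi" },
  "translation": { "eng": "One day a man," },
  "words": []
}
""")

print(check_utterance(example))
```

An empty words array is accepted here because the words property is optional and no minimum number of items is listed for it.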