Lexeme
Validate against: http://json-schema.org/draft-07/schema#
Schema ID: http://schemas.digitallinguistics.io/Lexeme-9.0.0.json
Type: object
Description
A lexeme is an abstract entity that represents all the various forms of a word. In DLx, a lexeme refers broadly to any bundle of related senses and forms, whether the item is an individual word, morpheme, idiom, or anything else that constitutes a semantic unit. Examples of lexemes in English might include be, run up (a phrasal verb), and ‑ing.
A lexeme will typically have multiple senses or meanings, and those are listed in the senses property. It is up to the linguist to decide when two meanings are related, and therefore belong to the same lexeme, or when they belong to different lexemes.
A lexeme often also has multiple base forms, such as suppletive forms, irregular forms, or morphologically-conditioned forms. For example, the lexeme be has the base forms be, am, is, etc. These are listed in the forms field. The forms field should not be used to list all the regularly-inflected forms of a word. In addition, individual base forms may have phonologically-conditioned allomorphs, and these are listed in the allomorphs field of the form.
The lexeme and its forms and senses may also have variations, such as dialectal and idiolectal variants, rapid speech variants, register-based variants, variations in meaning, or even spelling variants. These are listed in the variants fields.
By convention, one of the forms of a lexeme is typically chosen as a representative headword or lemma, and this is indicated by the lemma field. For example, the form man in English is typically used as the lemma/headword for the lexeme that includes the forms man and men.
Note that the DLx Lexeme does not represent a lexical entry in a dictionary. Dictionaries typically list each base form of a lexeme as a separate lexical entry. The DLx Lexeme puts each of these lexical entries together in the forms field instead.
Required Properties
forms, lemma, senses
Dependencies
This object has the following dependencies between properties:
- If this object has the variantType property, it must have the following properties as well: variantOf
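As a rough illustration of the required properties and the dependency listed above, the following Python sketch checks a lexeme held as a plain dictionary. The name check_lexeme_shape is hypothetical and not part of DLx, and a full JSON Schema validator would check far more than these two rules.

```python
REQUIRED = ("forms", "lemma", "senses")


def check_lexeme_shape(lexeme: dict) -> list:
    """Return a list of problems; an empty list means both checks passed."""
    problems = [f"missing required property: {name}"
                for name in REQUIRED if name not in lexeme]
    # Dependency: variantType may only appear together with variantOf.
    if "variantType" in lexeme and "variantOf" not in lexeme:
        problems.append("variantType is present but variantOf is missing")
    return problems
```

Returning a list of messages rather than raising on the first problem lets a caller report everything that is wrong with a record in one pass.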
Properties
The following properties are defined for this object:
Type: type
Type: string
Read-only: true
Description
The type of object. Must be set to Lexeme.
This item must have the following value: "Lexeme"
ID: id
Description
A unique database identifier for this Lexeme
Lexeme Key: key
Type: string
Description
A human-readable key that uniquely identifies this lexeme or variant within the language. Best practice is for the key to consist of a representation of the lemma form of the word without diacritics, and, if the word is a homonym, the homonym number. However, any value is acceptable as long as it is unique for the language. (Keys do not need to be unique across languages.)
Regular expression to match: ^[^\s]+$
Alternative Analyses: alternativeAnalyses
Type: array
Description
An array of alternative Lexeme objects for this lexeme. This can be useful when working with historical sources or research from other linguists. It allows users to decide on their own analysis, while still maintaining a faithful record of the analyses of the original documentation.
Items must be unique: true
Items
Each item in this array must adhere to the following schema:
Alternative Analysis: alternativeAnalyses
Type: object
Description
A lexeme object representing an alternative analysis of this lexeme.
Referenced Schema
This item must validate against the following schema:
Bibliography: bibliography
Type: array
Description
A list of citations to attested bibliographic sources where this lexeme appears or is discussed. For precision’s sake, it is recommended that sources be listed for specific senses or forms of a lexeme whenever possible.
Items must be unique: true
Items
Each item in this array must adhere to the following schema:
Citation: bibliography
Type: object
Description
A citation to an attested source for this lexeme.
Referenced Schema
This item must validate against the following schema:
Citation Form: citationForm
Type: object
Description
The citation form of a lexeme is the form given when spoken in isolation, which may be different from its lemma form. For example, in Swahili the citation form of a verb is typically the infinitive, e.g. kuandika 'to write', even though ‑andika is typically used as its lemma form. It may be represented in multiple orthographies. Do not include leading or trailing tokens (e.g. hyphens, equal signs) in this field.
Referenced Schema
This item must validate against the following schema:
Date Created: dateCreated
Type: string
Description
The date and optionally time that this lexeme was created
This item must also validate against exactly one of the following schemas:
Format: date
Format: date-time
Date Modified: dateModified
Type: string
Description
The date and optionally time that this lexeme was last modified
This item must also validate against exactly one of the following schemas:
Format: date
Format: date-time
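Both date properties accept either a date or a date-time string. A minimal sketch of that check, using only the Python standard library, is given here. The helper name is_date_or_datetime is hypothetical, and the parsing only approximates the date and date-time formats of JSON Schema (newer Python versions, for example, accept a few additional ISO 8601 spellings).

```python
from datetime import date, datetime


def is_date_or_datetime(value: str) -> bool:
    """Approximate check that value is a date (YYYY-MM-DD) or a date-time."""
    try:
        date.fromisoformat(value)
        return True
    except ValueError:
        pass
    try:
        # Older Python versions reject a trailing "Z", so map it to an offset.
        datetime.fromisoformat(value.replace("Z", "+00:00"))
        return True
    except ValueError:
        return False
```

The timestamps in the example value for this schema, such as 2018-11-03T00:23:55.842Z, pass this check, as does a bare date such as 2018-11-03.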
Examples: examples
Type: array
Description
A collection of examples illustrating this lexeme in use. Each example is an Utterance from a Text. The Utterance number should be indicated in the index field of the Database Reference object. If using a full Utterance object rather than a Database Reference object, the key field should be included. For precision’s sake, it is recommended that examples be given for individual senses and forms rather than the entire lexeme when possible.
Items must be unique: true
Items
Each item in this array must adhere to the following schema:
Example Utterance (Database Reference): examples
Type: object
Description
A database reference to an Utterance object
Referenced Schema
This item must validate against the following schema:
Features: features
Type: object
Description
A set of inflectional features for this lexeme (used primarily with grammatical morphemes). Each property should be the name of a feature type (e.g. case, person, number, gender, nounClass, etc.), and its value should be the value for that feature, as a string (e.g. nominative, 1, singular, masculine, etc.). Features may be written more than once, in different languages. For example, a morpheme may have the feature case: accusative (English) as well as caso: acusativo (Spanish).
This item must also validate against all of the following schemas:
Tags
Type: object
Description
The Features object must be a Tags object
Referenced Schema
This item must validate against the following schema:
Additional Properties
Any additional properties must adhere to the following schema:
Feature: 1
Type: string
Description
Individual features must be represented as Strings
Minimum length: 1
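To make the constraint on feature values concrete: every value in the features object has to be a non-empty string, so a person feature is written "1" rather than the number 1. The sketch below checks only that constraint on a plain dictionary. invalid_features is a hypothetical helper, and the requirements of the referenced Tags schema are not checked here.

```python
def invalid_features(features: dict) -> list:
    """Return the names of features whose value is not a non-empty string."""
    return [name for name, value in features.items()
            if not isinstance(value, str) or len(value) < 1]
```

For the bilingual example given in the description, invalid_features({"case": "accusative", "caso": "acusativo"}) returns an empty list.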
Lexeme Base Forms: forms
Type: array
Description
A collection of base forms for this lexeme, i.e. the different forms that this lexeme or morpheme may take, exclusive of its regular inflectional variants. Each base form typically corresponds to a lexical entry in a dictionary. For example: the lexeme man would include the forms man and men; the lexeme run would include the forms run and ran, but not runs or running, because these are regularly-inflected and therefore predictable; the lexeme be would include am, are, is, etc., because these are irregular / suppletive forms, but would not include being.
Items must be unique: true
Items
Each item in this array must adhere to the following schema:
Lexeme Base Form: forms
Type: object
Description
One of the base forms of this lexeme
Referenced Schema
This item must validate against the following schema:
Language (DatabaseReference): language
Type: object
Description
The language of this Lexeme. This property is most useful when working with lexical data from multiple languages.
Referenced Schema
This item must validate against the following schema:
Lemma: lemma
Type: object
Description
A lemma is the form of a lexeme conventionally used to represent that lexeme. It is typically used as the headword in dictionary entries. It may differ drastically from the citation form. For example, the form be is typically used as the lemma form of the English verb forms am, are, is, etc. Lemmas may be represented in multiple orthographies. Do not include any leading or trailing tokens (e.g. hyphens, equal signs).
Referenced Schema
This item must validate against the following schema:
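As an illustration only (this is not the referenced schema): because a lemma may be given in several orthographies, code that displays a headword usually has to pick one. The example value for this schema shows the lemma as an object whose property names are orthography names (APA, IPA, Mod, Swad). Assuming that shape, a hypothetical helper named headword could look like this:

```python
def headword(lemma: dict, preferred: list) -> str:
    """Return the lemma in the first preferred orthography that is present."""
    for orthography in preferred:
        if orthography in lemma:
            return lemma[orthography]
    # Fall back to whichever orthography is listed first, or "" if none is.
    return next(iter(lemma.values()), "")
```

With the lemma from the example value, headword(lemma, ["Mod", "APA"]) selects the Mod spelling, guxt.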
Lexical Relations: lexicalRelations
Type: array
Description
A list of the lexical relations that this lexeme has to other lexemes. Each item is a Database Reference, and must also have a property called relation, indicating the type of lexical relation. For precision’s sake, lexical relations should be specified for individual senses rather than the entire lexeme whenever possible.
Items must be unique: true
Items
Each item in this array must adhere to the following schema:
lexicalRelations
This item must also validate against all of the following schemas:
Lexeme (Database Reference)
Type: object
Description
A database reference representing a lexical relation between two lexemes or senses. Note: The database reference must also have a relation property specified, indicating the type of lexical relation.
Referenced Schema
This item must validate against the following schema:
Required Properties
relation
Properties
The following properties are defined for this object:
Relation Type: relation
Type: string
Description
The type of lexical relation that holds between the current item and the referenced Lexeme. Can also be used for general cross-references (a compare relation) or historical relationships (a derivedFrom or originOf relation). Examples: antonym, synonym, cognate, derivedFrom, originOf, compare, partOf, hypernymOf, hyponymOf.
Referenced Schema
This item must validate against the following schema:
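As an illustration only (this is not the referenced schema): each item in lexicalRelations has to carry a relation property alongside the database reference. A small sketch that groups the items by relation type, and rejects items that lack the property, might look as follows. The function name group_by_relation is hypothetical, and the items are treated as plain dictionaries without checking them against the Database Reference schema.

```python
from collections import defaultdict


def group_by_relation(lexical_relations: list) -> dict:
    """Group lexical relation items by the value of their relation property."""
    groups = defaultdict(list)
    for item in lexical_relations:
        if "relation" not in item:
            raise ValueError("lexical relation is missing its relation property")
        groups[item["relation"]].append(item)
    return dict(groups)
```

Grouping like this makes it straightforward to print, say, all synonyms of a lexeme together in a dictionary entry.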
Lexeme Type: lexemeType
Type: string
Description
The type of lexeme this is (either lexical or grammatical). The primary purpose of this field is to make it easy to style interlinear glosses in small caps.
Allowed Values
"lexical", "grammatical"
Link: link
Type: string
Description
A URL where a presentational format for this resource may be viewed
Format: uri
Literal Meaning: literalMeaning
Description
The literal meaning of the lexeme, optionally in multiple languages
Referenced Schema
This item must validate against the following schema:
Media: media
Type: array
Description
Media items associated with this lexeme, such as recordings of the citation form of the word, pictures of the item this word refers to, or videos of the action being performed. If a media item pertains to a specific sense or form, it should be placed in that sense or form’s media field instead.
Items must be unique: true
Items
Each item in this array must adhere to the following schema:
Media Item (Database Reference): media
Type: object
Description
A database reference to a media item associated with this lexeme
Referenced Schema
This item must validate against the following schema:
Morpheme Type: morphemeType
Description
The morphological type for this lexeme. Example values might include "root", "stem", "prefix", "suffix", "compound", "phrasal verb", etc.
Referenced Schema
This item must validate against the following schema:
Notes: notes
Type: array
Description
A collection of notes about this lexeme. Each Note object must have its noteType property specified. Notes with a note type of private are not intended for publication in dictionaries, while other types of notes are. For precision’s sake, it is recommended that notes be attached to specific forms or senses whenever possible.
Items must be unique: true
Items
Each item in this array must adhere to the following schema:
notes
This item must also validate against all of the following schemas:
Note
Type: object
Description
A note about this lexeme
Referenced Schema
This item must validate against the following schema:
Required Properties
noteType
Properties
The following properties are defined for this object:
Note Type: noteType
Type: string
Description
The type of note about this lexeme
Allowed Values
"private", "general", "anthropology", "discourse", "encyclopedic", "grammar", "phonology", "semantics", "sociocultural", "pragmatic"
Senses: senses
Type: array
Description
A collection of meanings or senses for this lexeme. It is up to the linguist to decide whether two uses of a lexeme are distinct enough to be considered separate senses.
Minimum number of items: 1
Items must be unique: true
Items
Each item in this array must adhere to the following schema:
Sense: senses
Type: object
Description
A sense or meaning for this lexeme.
Referenced Schema
This item must validate against the following schema:
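As an illustration only (this is not the referenced schema): the two array constraints on senses, at least one item and no duplicate items, can be approximated in a few lines. In this sketch two senses count as duplicates when they serialize to the same JSON text once their keys are sorted. That only approximates the equality rules of JSON Schema (it treats 1 and 1.0 as different, for instance). check_senses is a hypothetical name.

```python
import json


def check_senses(senses: list) -> list:
    """Return a list of problems with the senses array; empty means none found."""
    problems = []
    if len(senses) < 1:
        problems.append("senses must contain at least one item")
    seen = set()
    for index, sense in enumerate(senses):
        # Sorting keys makes the comparison independent of property order.
        fingerprint = json.dumps(sense, sort_keys=True)
        if fingerprint in seen:
            problems.append(f"sense at index {index} duplicates an earlier sense")
        seen.add(fingerprint)
    return problems
```

The same approach would apply to the other arrays in this schema whose items must be unique, such as forms or notes.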
Sources: sources
Type: array
Description
A list of the initials of the speaker or speakers who are the attested sources for this lexeme. For precision’s sake, sources should be listed for specific senses or forms of a lexeme whenever possible. These sources should be DatabaseReferences to a Person object.
Items must be unique: true
Items
Each item in this array must adhere to the following schema:
Source: sources
Type: object
Description
An attested source for this lexeme. This will often be the initials of a speaker.
Referenced Schema
This item must validate against the following schema:
Tags: tags
Type: object
Description
A set of tags for this lexeme
Referenced Schema
This item must validate against the following schema:
URL: url
Type: string
Description
A URL where a JSON representation of this lexeme may be retrieved
Format: uri
Variant Of: variantOf
Type: object
Description
If this lexeme is a variant of another lexeme, this field should contain a reference to the other Lexeme. Lexemes may only be variants of one other Lexeme.
Referenced Schema
This item must validate against the following schema:
Variants: variants
Type: array
Description
A list of variants of this lexeme. This field should be used for dialectal and idiolectal variants, rapid and careful speech variants, register-based variants, variations in meaning, spelling variants, etc. It should not be used for phonologically-conditioned variants (use the allomorphs field of a specific form instead) or morphologically-conditioned variants (use the forms field instead). Each variant should have its variantType property specified.
Items must be unique: true
Items
Each item in this array must adhere to the following schema:
variants
This item must also validate against all of the following schemas:
Variant (Database Reference)
Type: object
Description
A database reference to a variant of this lexeme. Note: The Database Reference object must have a variantType property, indicating the type of variant.
Referenced Schema
This item must validate against the following schema:
Required Properties
variantType
Properties
The following properties are defined for this object:
Variant Type: variantType
Description
This field is used to specify the type of variant. Possible values might be a person’s name (representing an idiolectal variant), or simply idiolectal, or dialectal (or the name of the dialect), or rapid speech, etc. May be in multiple languages.
Referenced Schema
This item must validate against the following schema:
Variant Type: variantType
Description
If this lexeme is a variant of another lexeme or sense, this field can be used to specify the type of variant. Possible values might be a person’s name (representing an idiolectal variant), or simply idiolectal, or dialectal (or the name of the dialect), or rapid speech, etc. Optionally in multiple languages.
Referenced Schema
This item must validate against the following schema:
Additional Properties
Any additional properties must adhere to the following schema:
This schema imposes no restrictions. All values are valid.
Examples
The following are example values for this schema:
-
{
  "alternativeAnalyses": [
    {
      "forms": [ { "transcription": { "APA": "kʼuš" } } ],
      "lemma": { "APA": "kʼuš", "IPA": "kˀuš" },
      "senses": [ { "argumentStructure": "eat(agent, patient)", "gloss": "eat" } ]
    }
  ],
  "bibliography": [ { "citationKey": "Duralde1802" } ],
  "citationForm": { "APA": "kʼušti", "IPA": "kˀušti", "Mod": "guxti", "Swad": "gušti" },
  "dateCreated": "2018-11-03T00:23:55.842Z",
  "dateModified": "2018-11-03T00:24:04.730Z",
  "id": "783cbaa8-befe-4049-bfe4-bb5688173780",
  "key": "guxt-(1)",
  "language": { "abbreviation": "chiti", "id": "3d91a22d-005b-4ec5-8151-09e44629f58f" },
  "lemma": { "APA": "kʼušt", "IPA": "kˀušt", "Mod": "guxt", "Swad": "gušt" },
  "link": "https://explorer.digitallinguistics.io/languages/Chitimacha/lexemes/guxt",
  "type": "Lexeme",
  "url": "https://data.digitallinguistics.io/languages/Chitimacha/lexemes/guxt",
  "forms": [
    {
      "allomorphs": [
        {
          "environments": [ "_m" ],
          "syllableStructure": "CVC",
          "transcription": { "APA": "kʼuš", "IPA": "kˀuš", "Mod": "gux", "Swad": "guš" }
        }
      ],
      "components": [
        { "id": "e0e2dbdb-f89b-4002-bd46-4f6803bb4391", "key": "gux" },
        { "id": "797f0f6b-3024-4d0c-bbfc-a1bc7cc48b81", "key": "t1" }
      ],
      "inflectionClass": "main verb",
      "link": "https://data.digitallinguistics.io/languages/Chitimacha/lexemes/guxt/forms/guxt",
      "media": [ { "id": "24a14428-f2e7-47a8-8d4f-00c4437f6c3a", "filename": "guxt.wav" } ],
      "morphemeType": "stem",
      "sources": [ { "source": { "abbreviation": "DWH" } } ],
      "syllableStructure": "CVCC",
      "transcription": { "APA": "kʼušt", "IPA": "kˀušt", "Mod": "guxt", "Swad": "gušt" },
      "variants": [
        { "id": "0f765e8d-1401-4c01-a88f-3092d077813b", "key": "guxma", "variantType": "pluractional" }
      ]
    }
  ],
  "senses": [
    {
      "argumentStructure": "eat(agent, patient)",
      "category": "transitive verb",
      "definition": "to eat",
      "gloss": "eat",
      "link": "https://data.digitallinguistics.io/languages/Chitimacha/lexemes/guxt/senses/2"
    }
  ]
}
Developer Notes
This is a top-level database object.