Lexeme

Validate against: http://json-schema.org/draft-07/schema#

Schema ID: http://schemas.digitallinguistics.io/Lexeme-9.0.0.json

Type: object

Description

A lexeme is an abstract entity that represents all the various forms of a word. In DLx, a lexeme refers broadly to any bundle of related senses and forms, whether the item is an individual word, morpheme, idiom, etc.: anything that constitutes a semantic unit. Examples of lexemes in English might include be, run up (a phrasal verb), and ‑ing.

A lexeme will typically have multiple senses or meanings, and those are listed in the senses property. It is up to the linguist to decide when two meanings are related, and therefore belong to the same lexeme, or when they belong to different lexemes.

A lexeme often also has multiple base forms, such as suppletive forms, irregular forms, or morphologically-conditioned forms. For example, the lexeme be has the base forms be, am, is, etc. These are listed in the forms field. The forms field should not be used to list all the regularly-inflected forms of a word. In addition, individual base forms may have phonologically-conditioned allomorphs, and these are listed in the allomorphs field of the form.

The lexeme and its forms and senses may also have variations, such as dialectal and idiolectal variants, rapid speech variants, register-based variants, variations in meaning, or even spelling variants. These are listed in the variants fields.

By convention, one of the forms of a lexeme is typically chosen as a representative headword or lemma, and this is indicated by the lemma field. For example, the form man in English is typically used as the lemma/headword for the lexeme that includes the forms man and men.

Note that the DLx Lexeme does not represent a lexical entry in a dictionary. Dictionaries typically list each base form of a lexeme as a separate lexical entry. The DLx Lexeme puts each of these lexical entries together in the forms field instead.

Developer Notes

This is a top-level database object.

Required Properties

forms
lemma
senses
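
As a rough illustration of the required-properties rule, the Python sketch below reports which of the three required properties are absent from a candidate Lexeme. The function name missing_required is hypothetical and not part of any DLx tooling, and a full JSON Schema validator would check much more than key presence.

```python
# The three properties this schema lists as required.
REQUIRED_PROPERTIES = ("forms", "lemma", "senses")


def missing_required(lexeme: dict) -> list:
    """Return the required property names that are absent from `lexeme`."""
    return [name for name in REQUIRED_PROPERTIES if name not in lexeme]
```

An empty result means all three properties are present. It says nothing about whether their values are valid.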

Dependencies

This object has the following dependencies between properties:

If this object has the variantType property, it must have the following properties as well:
- variantOf
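
The dependency runs in one direction only: variantType requires variantOf, but variantOf may appear without variantType. A minimal Python sketch of that check follows; the function name and the error message are assumptions made for illustration.

```python
def dependency_errors(lexeme: dict) -> list:
    """Check the variantType -> variantOf dependency on a Lexeme object."""
    errors = []
    # Only the presence of variantType triggers the requirement.
    if "variantType" in lexeme and "variantOf" not in lexeme:
        errors.append("variantType is present, so variantOf is required")
    return errors
```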

Properties

The following properties are defined for this object:

Type: type

Type: string

Read-only: true

Description

The type of object. Must be set to Lexeme.

This item must have the following value:
```
"Lexeme"
```
ID: id

Description

A unique database identifier for this Lexeme
Lexeme Key: key

Type: string

Description

A human-readable key that uniquely identifies this lexeme or variant within the language. Best practice is for the key to consist of a representation of the lemma form of the word without diacritics, and, if the word is a homonym, the homonym number. However, any value is acceptable as long as it is unique for the language. (Keys do not need to be unique across languages.)

Regular expression to match: ^[^\s]+$
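
The pattern requires at least one character and forbids whitespace anywhere in the key. The Python sketch below applies the same character class; is_valid_key is a hypothetical helper name, and the exact set of characters treated as whitespace can differ slightly between regular expression engines.

```python
import re

# Same character class as the schema pattern ^[^\s]+$.
# fullmatch() replaces the ^...$ anchors because Python's `$` also
# matches just before a trailing newline.
KEY_PATTERN = re.compile(r"[^\s]+")


def is_valid_key(key) -> bool:
    """True when `key` is a non-empty string containing no whitespace."""
    return isinstance(key, str) and KEY_PATTERN.fullmatch(key) is not None
```

Under this check a key such as guxt-(1) passes, while run up (with a space) and the empty string do not.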
Alternative Analyses: alternativeAnalyses

Type: array

Description

An array of alternative Lexeme objects for this lexeme. This can be useful when working with historical sources or research from other linguists. It allows users to decide on their own analysis, while still maintaining a faithful record of the analyses of the original documentation.

Items must be unique: true

Items

Each item in this array must adhere to the following schema:

Alternative Analysis: alternativeAnalyses

Type: object

Description

A lexeme object representing an alternative analysis of this lexeme.

Referenced Schema

This item must validate against the following schema:

http://schemas.digitallinguistics.io/Lexeme.json
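
Several array properties on this object, starting with alternativeAnalyses, state that items must be unique. One way to approximate that check in Python is to compare a canonical serialization of each item, as sketched below. The helper name duplicate_indexes is hypothetical. The comparison is also slightly stricter than JSON Schema equality, because the numbers 1 and 1.0 serialize differently here.

```python
import json


def duplicate_indexes(items: list) -> list:
    """Return the indexes of items that repeat an earlier item in the array."""
    seen = set()
    duplicates = []
    for index, item in enumerate(items):
        # Sorting keys makes objects that differ only in key order compare equal.
        fingerprint = json.dumps(item, sort_keys=True, ensure_ascii=False)
        if fingerprint in seen:
            duplicates.append(index)
        else:
            seen.add(fingerprint)
    return duplicates
```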
Bibliography: bibliography

Type: array

Description

A list of citations to attested bibliographic sources where this lexeme appears or is discussed. For precision’s sake, it is recommended that sources be listed for specific senses or forms of a lexeme whenever possible.

Items must be unique: true

Items

Each item in this array must adhere to the following schema:

Citation: bibliography

Type: object

Description

A citation to an attested source for this lexeme.

Referenced Schema

This item must validate against the following schema:

http://schemas.digitallinguistics.io/Citation.json
Citation Form: citationForm

Type: object

Description

The citation form of a lexeme is the form given when spoken in isolation, which may be different from its lemma form. For example, in Swahili the citation form of a verb is typically the infinitive, e.g. kuandika ‘to write’, even though ‑andika is typically used as its lemma form. It may be represented in multiple orthographies. Do not include leading or trailing tokens (e.g. hyphens, equal signs) in this field.

Referenced Schema

This item must validate against the following schema:

http://schemas.digitallinguistics.io/Transcription.json
Date Created: dateCreated

Type: string

Description

The date and optionally time that this lexeme was created
This item must also validate against exactly one of the following schemas:
- Format: date
- Format: date-time
Date Modified: dateModified

Type: string

Description

The date and optionally time that this lexeme was last modified
This item must also validate against exactly one of the following schemas:
- Format: date
- Format: date-time
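
Both dateCreated and dateModified accept either a full date or a date-time. The sketch below is one possible Python check for that rule, written with the standard library only. The helper names are hypothetical, the patterns follow the RFC 3339 shapes that these two formats refer to, and time zone offsets are matched by shape but not range-checked. Validators may or may not enforce format, so this is an illustration rather than a description of how any particular validator behaves.

```python
import re
from datetime import date

DATE = re.compile(r"(\d{4})-(\d{2})-(\d{2})", re.ASCII)
DATE_TIME = re.compile(
    r"(\d{4}-\d{2}-\d{2})[Tt](\d{2}):(\d{2}):(\d{2})(\.\d+)?([Zz]|[+-]\d{2}:\d{2})",
    re.ASCII,
)


def is_date(value: str) -> bool:
    """True for a real calendar date written as YYYY-MM-DD."""
    match = DATE.fullmatch(value)
    if match is None:
        return False
    try:
        date(*(int(part) for part in match.groups()))
    except ValueError:  # e.g. month 13 or 30 February
        return False
    return True


def is_date_time(value: str) -> bool:
    """True for a date, a 'T', a time, and a 'Z' or numeric offset."""
    match = DATE_TIME.fullmatch(value)
    if match is None or not is_date(match.group(1)):
        return False
    hours, minutes, seconds = (int(match.group(i)) for i in (2, 3, 4))
    # RFC 3339 allows second 60 for leap seconds.
    return hours <= 23 and minutes <= 59 and seconds <= 60


def is_valid_timestamp(value) -> bool:
    """True when `value` matches exactly one of the two formats."""
    return isinstance(value, str) and (is_date(value) != is_date_time(value))
```

With these definitions, 2018-11-03 and 2018-11-03T00:23:55.842Z are both accepted, while 2018-02-30 and 03/11/2018 are rejected.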
Examples: examples

Type: array

Description

A collection of examples illustrating this lexeme in use. Each example is an Utterance from a Text. The Utterance number should be indicated in the index field of the Database Reference object. If using a full Utterance object rather than a Database Reference object, the key field should be included. For precision’s sake, it is recommended that examples be given for individual senses and forms rather than the entire lexeme when possible.

Items must be unique: true

Items

Each item in this array must adhere to the following schema:

Example Utterance (Database Reference): examples

Type: object

Description

A database reference to an Utterance object

Referenced Schema

This item must validate against the following schema:

http://schemas.digitallinguistics.io/DatabaseReference.json
Features: features

Type: object

Description

A set of inflectional features for this lexeme (used primarily with grammatical morphemes). Each property should be the name of a feature type (e.g. case, person, number, gender, nounClass, etc.), and its value should be the value for that feature, as a string (e.g. nominative, 1, singular, masculine, etc.). Features may be written more than once, in different languages. For example, a morpheme may have the feature case: accusative (English) as well as caso: acusativo (Spanish).
This item must also validate against all of the following schemas:
- Tags
  
  Type: object
  
  Description
  
  The Features object must be a Tags object
  
  Referenced Schema
  
  This item must validate against the following schema:
  
  http://schemas.digitallinguistics.io/Tags.json
- Additional Properties
  
  Any additional properties must adhere to the following schema:
  
  Feature: 1
  
  Type: string
  
  Description
  
  Individual features must be represented as Strings
  
  Minimum length: 1
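
A common mistake with features is to store a value such as first person as the number 1 rather than the string "1". The Python sketch below flags such values; it covers only the additional-properties rule (non-empty strings) and does not attempt to check whatever else Tags.json may require. The helper name invalid_feature_names is hypothetical.

```python
def invalid_feature_names(features: dict) -> list:
    """Return the names of features whose value is not a non-empty string."""
    return [
        name
        for name, value in features.items()
        if not (isinstance(value, str) and len(value) >= 1)
    ]
```

For example, {"case": "accusative", "caso": "acusativo"} yields no invalid names, while {"person": 1, "number": ""} flags both person and number.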
Lexeme Base Forms: forms

Type: array

Description

A collection of base forms for this lexeme, i.e. the different forms that this lexeme or morpheme may take, exclusive of its regular inflectional variants. Each base form typically corresponds to a lexical entry in a dictionary. For example: the lexeme man would include the forms man and men; the lexeme run would include the forms run and ran, but not runs or running, because these are regularly-inflected and therefore predictable; the lexeme be would include am, are, is, etc., because these are irregular / suppletive forms, but would not include being.

Items must be unique: true

Items

Each item in this array must adhere to the following schema:

Lexeme Base Form: forms

Type: object

Description

One of the base forms of this lexeme

Referenced Schema

This item must validate against the following schema:

http://schemas.digitallinguistics.io/LexemeForm.json
Language (DatabaseReference): language

Type: object

Description

The language of this Lexeme. This property is most useful when working with lexical data from multiple languages.

Referenced Schema

This item must validate against the following schema:

http://schemas.digitallinguistics.io/DatabaseReference.json
Lemma: lemma

Type: object

Description

A lemma is the form of a lexeme conventionally used to represent that lexeme. It is typically used as the headword in dictionary entries. It may differ drastically from the citation form. For example, the form be is typically used as the lemma form of the English verb forms am, are, is, etc. Lemmas may be represented in multiple orthographies. Do not include any leading or trailing tokens (e.g. hyphens, equal signs).

Referenced Schema

This item must validate against the following schema:

http://schemas.digitallinguistics.io/Transcription.json
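
The rule about leading and trailing tokens applies to the lemma and to the citation form alike. The Python sketch below flags offending transcriptions, assuming that a Transcription is an object mapping orthography names to strings, as in the example values for this schema. The helper name and the exact list of boundary characters are assumptions; the schema text only mentions hyphens and equal signs as examples.

```python
# Hyphen-minus, non-breaking hyphen (U+2011), and equal sign.
# This list is an assumption for illustration, not part of the schema.
BOUNDARY_TOKENS = ("-", "\u2011", "=")


def transcriptions_with_boundary_tokens(transcription: dict) -> list:
    """Return orthography names whose text starts or ends with a boundary token."""
    return [
        orthography
        for orthography, text in transcription.items()
        if isinstance(text, str)
        and (text.startswith(BOUNDARY_TOKENS) or text.endswith(BOUNDARY_TOKENS))
    ]
```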
Lexical Relations: lexicalRelations

Type: array

Description

A list of the lexical relations that this lexeme has to other lexemes. Each item is a Database Reference, and must also have a property called relation, indicating the type of lexical relation. For precision’s sake, lexical relations should be specified for individual senses rather than the entire lexeme whenever possible.

Items must be unique: true
Items

Each item in this array must adhere to the following schema:
lexicalRelations

This item must also validate against all of the following schemas:

Lexeme (Database Reference)

Type: object

Description

A database reference representing a lexical relation between two lexemes or senses. Note: The database reference must also have a relation property specified, indicating the type of lexical relation.

Referenced Schema

This item must validate against the following schema:

http://schemas.digitallinguistics.io/DatabaseReference.json

Required Properties

relation

Properties

The following properties are defined for this object:

Relation Type: relation

Type: string

Description

The type of lexical relation that holds between the current item and the referenced Lexeme. Can also be used for general cross-references (a compare relation) or historical relationships (a derivedFrom or originOf relation). Examples: antonym, synonym, cognate, derivedFrom, originOf, compare, partOf, hypernymOf, hyponymOf.

Referenced Schema

This item must validate against the following schema:

http://schemas.digitallinguistics.io/Abbreviation.json
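
Because each lexical relation is a Database Reference with an extra required relation property, a quick sanity check is to look for items that lack it. The Python sketch below does only that; it does not verify the value against Abbreviation.json, whose constraints are not reproduced here, and the helper name is hypothetical.

```python
def relations_missing_type(lexical_relations: list) -> list:
    """Return the indexes of items that lack a string `relation` property."""
    return [
        index
        for index, item in enumerate(lexical_relations)
        if not (isinstance(item, dict) and isinstance(item.get("relation"), str))
    ]
```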
Lexeme Type: lexemeType

Type: string

Description

The type of lexeme this is (either lexical or grammatical). The primary purpose of this field is to make it easy to style interlinear glosses in small caps.
Allowed Values
- "lexical"
- "grammatical"
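
To show how a renderer might use this field, the Python sketch below picks a text style for a gloss based on lexemeType. The style names and the function are hypothetical, and treating a missing lexemeType as normal text is an assumption, since the property is optional.

```python
ALLOWED_LEXEME_TYPES = ("lexical", "grammatical")


def gloss_style(lexeme: dict) -> str:
    """Choose a (hypothetical) style name for this lexeme's interlinear gloss."""
    lexeme_type = lexeme.get("lexemeType")
    if lexeme_type is None:
        return "normal"  # assumption: untyped lexemes get default styling
    if lexeme_type not in ALLOWED_LEXEME_TYPES:
        raise ValueError(f"lexemeType must be one of {ALLOWED_LEXEME_TYPES}")
    # Grammatical glosses are conventionally set in small caps.
    return "small-caps" if lexeme_type == "grammatical" else "normal"
```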
Link: link

Type: string

Description

A URL where a presentational format for this resource may be viewed

Format: uri
Literal Meaning: literalMeaning

Description

The literal meaning of the lexeme, optionally in multiple languages

Referenced Schema

This item must validate against the following schema:

http://schemas.digitallinguistics.io/MultiLangString.json
Media: media

Type: array

Description

Media items associated with this lexeme, such as recordings of the citation form of the word, pictures of the item this word refers to, or videos of the action being performed. If a media item pertains to a specific sense or form, it should be placed in that sense or form’s media field instead.

Items must be unique: true

Items

Each item in this array must adhere to the following schema:

Media Item (Database Reference): media

Type: object

Description

A database reference to a media item associated with this lexeme

Referenced Schema

This item must validate against the following schema:

http://schemas.digitallinguistics.io/DatabaseReference.json
Morpheme Type: morphemeType

Description

The morphological type for this lexeme. Example values might include "root", "stem", "prefix", "suffix", "compound", "phrasal verb", etc.

Referenced Schema

This item must validate against the following schema:

http://schemas.digitallinguistics.io/MultiLangString.json
Notes: notes

Type: array

Description

A collection of notes about this lexeme. Each Note object must have its noteType property specified. Notes with a note type of private are not intended for publication in dictionaries, while other types of notes are. For precision’s sake, it is recommended that notes be attached to specific forms or senses whenever possible.
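
The distinction between private and publishable notes can be applied when exporting a dictionary. The Python sketch below filters a notes array accordingly, using the allowed noteType values defined for this property. The function name is hypothetical, and only the noteType property is inspected because the rest of the Note schema is not reproduced here.

```python
ALLOWED_NOTE_TYPES = {
    "private", "general", "anthropology", "discourse", "encyclopedic",
    "grammar", "phonology", "semantics", "sociocultural", "pragmatic",
}


def publishable_notes(notes: list) -> list:
    """Return the notes intended for publication (everything except private)."""
    for index, note in enumerate(notes):
        if note.get("noteType") not in ALLOWED_NOTE_TYPES:
            raise ValueError(f"notes[{index}] has a missing or unknown noteType")
    return [note for note in notes if note["noteType"] != "private"]
```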

Items must be unique: true
Items

Each item in this array must adhere to the following schema:
notes

This item must also validate against all of the following schemas:

Note

Type: object

Description

A note about this lexeme

Referenced Schema

This item must validate against the following schema:

http://schemas.digitallinguistics.io/Note.json

Required Properties

noteType

Properties

The following properties are defined for this object:

Note Type: noteType

Type: string

Description

The type of note about this lexeme

Allowed Values
- "private"
- "general"
- "anthropology"
- "discourse"
- "encyclopedic"
- "grammar"
- "phonology"
- "semantics"
- "sociocultural"
- "pragmatic"
Senses: senses

Type: array

Description

A collection of meanings or senses for this lexeme. It is up to the linguist to decide whether two uses of a lexeme are distinct enough to be considered separate senses.

Minimum number of items: 1

Items must be unique: true

Items

Each item in this array must adhere to the following schema:

Sense: senses

Type: object

Description

A sense or meaning for this lexeme.

Referenced Schema

This item must validate against the following schema:

http://schemas.digitallinguistics.io/Sense.json
Sources: sources

Type: array

Description

A list of the initials of the speaker or speakers who are the attested sources for this lexeme. For precision’s sake, sources should be listed for specific senses or forms of a lexeme whenever possible. These sources should be DatabaseReferences to a Person object.

Items must be unique: true

Items

Each item in this array must adhere to the following schema:

Source: sources

Type: object

Description

An attested source for this lexeme. This will often be the initials of a speaker.

Referenced Schema

This item must validate against the following schema:

http://schemas.digitallinguistics.io/DatabaseReference.json
Tags: tags

Type: object

Description

A set of tags for this lexeme

Referenced Schema

This item must validate against the following schema:

http://schemas.digitallinguistics.io/Tags.json
URL: url

Type: string

Description

A URL where a JSON representation of this lexeme may be retrieved

Format: uri
Variant Of: variantOf

Type: object

Description

If this lexeme is a variant of another lexeme, this field should contain a reference to the other Lexeme. Lexemes may only be variants of one other Lexeme.

Referenced Schema

This item must validate against the following schema:

http://schemas.digitallinguistics.io/DatabaseReference.json
Variants: variants

Type: array

Description

A list of variants of this lexeme. This field should be used for dialectal and idiolectal variants, rapid and careful speech variants, register-based variants, variations in meaning, spelling variants, etc. It should not be used for phonologically-conditioned variants (use the allomorphs field of a specific form instead) or morphologically-conditioned variants (use the forms field instead). Each variant should have its variantType property specified.

Items must be unique: true
Items

Each item in this array must adhere to the following schema:
variants

This item must also validate against all of the following schemas:

Variant (Database Reference)

Type: object

Description

A database reference to a variant of this lexeme. Note: The Database Reference object must have a variantType property, indicating the type of variant.

Referenced Schema

This item must validate against the following schema:

http://schemas.digitallinguistics.io/DatabaseReference.json

Required Properties

variantType

Properties

The following properties are defined for this object:

Variant Type: variantType

Description

This field is used to specify the type of variant. Possible values might be a person’s name (representing an idiolectal variant), or simply idiolectal, or dialectal (or the name of the dialect), or rapid speech, etc. May be in multiple languages.

Referenced Schema

This item must validate against the following schema:

http://schemas.digitallinguistics.io/MultiLangString.json
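
The guidance on where a given variation belongs can be summarized as a small decision rule. The Python sketch below encodes it; the conditioning labels "phonological" and "morphological" are hypothetical names chosen for this illustration, not values defined by the schema.

```python
def field_for_variation(conditioning: str) -> str:
    """Name the field that should hold a variation, given what conditions it."""
    if conditioning == "phonological":
        # Phonologically-conditioned variants go on a specific base form.
        return "allomorphs"
    if conditioning == "morphological":
        # Morphologically-conditioned variants are separate base forms.
        return "forms"
    # Dialectal, idiolectal, register, speech-rate, meaning, and spelling variants.
    return "variants"
```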
Variant Type: variantType

Description

If this lexeme is a variant of another lexeme or sense, this field can be used to specify the type of variant. Possible values might be a person’s name (representing an idiolectal variant), or simply idiolectal, or dialectal (or the name of the dialect), or rapid speech, etc. Optionally in multiple languages.

Referenced Schema

This item must validate against the following schema:

http://schemas.digitallinguistics.io/MultiLangString.json

Additional Properties

Any additional properties must adhere to the following schema:

This schema imposes no restrictions. All values are valid.

Examples

The following are example values for this schema:

```
{
  "alternativeAnalyses": [
    {
      "forms": [
        {
          "transcription": {
            "APA": "kʼuš"
          }
        }
      ],
      "lemma": {
        "APA": "kʼuš",
        "IPA": "kˀuš"
      },
      "senses": [
        {
          "argumentStructure": "eat(agent, patient)",
          "gloss": "eat"
        }
      ]
    }
  ],
  "bibliography": [
    {
      "citationKey": "Duralde1802"
    }
  ],
  "citationForm": {
    "APA": "kʼušti",
    "IPA": "kˀušti",
    "Mod": "guxti",
    "Swad": "gušti"
  },
  "dateCreated": "2018-11-03T00:23:55.842Z",
  "dateModified": "2018-11-03T00:24:04.730Z",
  "id": "783cbaa8-befe-4049-bfe4-bb5688173780",
  "key": "guxt-(1)",
  "language": {
    "abbreviation": "chiti",
    "id": "3d91a22d-005b-4ec5-8151-09e44629f58f"
  },
  "lemma": {
    "APA": "kʼušt",
    "IPA": "kˀušt",
    "Mod": "guxt",
    "Swad": "gušt"
  },
  "link": "https://explorer.digitallinguistics.io/languages/Chitimacha/lexemes/guxt",
  "type": "Lexeme",
  "url": "https://data.digitallinguistics.io/languages/Chitimacha/lexemes/guxt",
  "forms": [
    {
      "allomorphs": [
        {
          "environments": [
            "_m"
          ],
          "syllableStructure": "CVC",
          "transcription": {
            "APA": "kʼuš",
            "IPA": "kˀuš",
            "Mod": "gux",
            "Swad": "guš"
          }
        }
      ],
      "components": [
        {
          "id": "e0e2dbdb-f89b-4002-bd46-4f6803bb4391",
          "key": "gux"
        },
        {
          "id": "797f0f6b-3024-4d0c-bbfc-a1bc7cc48b81",
          "key": "t1"
        }
      ],
      "inflectionClass": "main verb",
      "link": "https://data.digitallinguistics.io/languages/Chitimacha/lexemes/guxt/forms/guxt",
      "media": [
        {
          "id": "24a14428-f2e7-47a8-8d4f-00c4437f6c3a",
          "filename": "guxt.wav"
        }
      ],
      "morphemeType": "stem",
      "sources": [
        {
          "source": {
            "abbreviation": "DWH"
          }
        }
      ],
      "syllableStructure": "CVCC",
      "transcription": {
        "APA": "kʼušt",
        "IPA": "kˀušt",
        "Mod": "guxt",
        "Swad": "gušt"
      },
      "variants": [
        {
          "id": "0f765e8d-1401-4c01-a88f-3092d077813b",
          "key": "guxma",
          "variantType": "pluractional"
        }
      ]
    }
  ],
  "senses": [
    {
      "argumentStructure": "eat(agent, patient)",
      "category": "transitive verb",
      "definition": "to eat",
      "gloss": "eat",
      "link": "https://data.digitallinguistics.io/languages/Chitimacha/lexemes/guxt/senses/2"
    }
  ]
}
```
