Lexeme Form

Validate against: http://json-schema.org/draft-07/schema#

Schema ID: http://schemas.digitallinguistics.io/LexemeForm-2.0.0.json

Type: object

Description

The base forms of a lexeme are the minimal set of forms needed to determine the full set of inflectional possibilities of a lexeme. These include suppletive forms, irregular forms, and morphologically-conditioned forms. For example, the lexeme be has the base forms be, am, is, etc., while the lexeme man has the base forms man and men. A base form does not refer to a regularly-inflected, predictable form like being or cats. Principal parts of verbs in Latin are another example of base forms, since they are the minimal set of forms that someone must know to determine all the inflectional possibilities of a verb.

Required Properties

  • transcription

Dependencies

This object has the following dependencies between properties:

  • If this object has the variantType property, it must have the following properties as well:

    • variantOf
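
The two rules above (the required transcription property and the variantType dependency) can be sketched in Python as follows. This is a minimal illustration, not a substitute for a full JSON Schema validator; the function name check_required_and_dependencies and the sample values are hypothetical.

```python
def check_required_and_dependencies(form: dict) -> list:
    """Return a list of problems; an empty list means both rules pass."""
    problems = []
    # Required Properties: transcription must be present.
    if "transcription" not in form:
        problems.append("missing required property: transcription")
    # Dependencies: a form with variantType must also have variantOf.
    if "variantType" in form and "variantOf" not in form:
        problems.append("variantType is present, so variantOf is required")
    return problems


# Hypothetical sample data, for illustration only.
complete = {"transcription": {"Mod": "guxt"}}
incomplete = {"transcription": {"Mod": "guxt"}, "variantType": "dialectal"}

print(check_required_and_dependencies(complete))
print(check_required_and_dependencies(incomplete))
```

The first call reports no problems; the second reports the missing variantOf property.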

Properties

The following properties are defined for this object:

  • Type: type

    Type: string

    Read-only: true

    Description

    The type of object. Must be set to LexemeForm.

    This item must have the following value:

    "LexemeForm"
  • Allomorphs: allomorphs

    Type: array

    Description

    A list of allomorphs (that is, phonologically-conditioned alternants) of this lexeme

    Items must be unique: true

    Items

    Each item in this array must adhere to the following schema:

    Allomorph: allomorphs

    Type: object

    Description

    An allomorph of this lexeme

    Required Properties

    • environments
    • transcription

    Properties

    The following properties are defined for this object:

    • Environments: environments

      Type: array

      Description

      A list of phonological environments in which this allomorph occurs. May be an empty array.

      Items must be unique: true

      Items

      Each item in this array must adhere to the following schema:

      Environment: environments

      Type: string

      Description

      A formalization of a (morpho)phonological environment, e.g. _k

      Minimum length: 1

    • Syllable Structure: syllableStructure

      Type: string

      Description

      An abstract representation of the syllable structure of this allomorph, e.g. CVC

    • Tone: tone

      Type: string

      Description

      An abstract representation of the tonal pattern of this allomorph. Examples: HLH, 313, ˦˨˦ etc.

    • Transcription: transcription

      Type: object

      Description

      A transcription of this allomorph, optionally in multiple orthographies. Do not include any leading or trailing tokens (e.g. hyphens, equal signs).

      Referenced Schema

      This item must validate against the following schema:

      http://schemas.digitallinguistics.io/Transcription.json

    Additional Properties

    Any additional properties must adhere to the following schema:

    This schema imposes no restrictions. All values are valid.

  • Bibliography: bibliography

    Type: array

    Description

    A list of citations to attested sources where this lexeme form appears or is discussed. For precision’s sake, citations should be listed for specific forms of a lexeme rather than the lexeme whenever possible.

    Items must be unique: true

    Items

    Each item in this array must adhere to the following schema:

    Citation: bibliography

    Description

    An attested source for this lexeme form. This will be a citation to a published text in which the lexeme form appears.

    Referenced Schema

    This item must validate against the following schema:

    http://schemas.digitallinguistics.io/Citation.json

  • Components: components

    Type: array

    Description

    A list of the morphemes or other lexemes contained within the current form. For example, the form gentlemen in an English lexicon might reference the lexeme gentle and the form men within the lexeme man. Components may reference either an entire lexeme or a specific form. Components do not have to be unique (useful when the same morpheme appears twice in a word).

    Items must be unique: false

    Items

    Each item in this array must adhere to the following schema:

    Lexeme / Lexeme Form (Database Reference): components

    Type: object

    Description

    A database reference to a lexeme or lexeme form

    Referenced Schema

    This item must validate against the following schema:

    http://schemas.digitallinguistics.io/DatabaseReference.json

  • Examples: examples

    Type: array

    Description

    A collection of examples illustrating this lexeme form in use. Each example is an Utterance from a Text. The Utterance number should be indicated in the index field of the Database Reference object. If using a full Utterance object rather than a Database Reference object, the key field should be included. For precision’s sake, it is recommended that examples be given for individual forms rather than the entire lexeme when possible.

    Items must be unique: true

    Items

    Each item in this array must adhere to the following schema:

    Example Utterance (Database Reference): examples

    Type: object

    Description

    A database reference to an Utterance object

    Referenced Schema

    This item must validate against the following schema:

    http://schemas.digitallinguistics.io/DatabaseReference.json

  • Features: features

    Type: object

    Description

    A set of inflectional features for this lexeme form (used primarily with grammatical morphemes). Each property should be the name of a feature type (e.g. case, person, number, gender, nounClass, etc.), and its value should be the value for that feature, as a string (e.g. nominative, 1, singular, masculine, etc.). Features may be written more than once, in different languages. For example, a morpheme may have the feature case: accusative (English) as well as caso: acusativo (Spanish).

    This item must also validate against all of the following schemas:

    • Tags

      Type: object

      Description

      The Features object must be a Tags object

      Referenced Schema

      This item must validate against the following schema:

      http://schemas.digitallinguistics.io/Tags.json

    • Additional Properties

      Any additional properties must adhere to the following schema:

      Feature: 1

      Type: string

      Description

      Individual features must be represented as Strings

      Minimum length: 1

  • Inflectional Class: inflectionClass

    Description

    If this lexeme is a root or stem, this field indicates the inflectional class that the sense takes. If this lexeme is an inflectional morpheme, this field indicates the inflectional class that the morpheme belongs to. If this lexeme is a derivational morpheme, this field indicates the inflectional class of the derived form. May be written in multiple languages.

    Referenced Schema

    This item must validate against the following schema:

    http://schemas.digitallinguistics.io/MultiLangString.json

  • Link: link

    Type: string

    Description

    A URL where a presentational format for this lexeme form may be viewed

    Format: uri

  • Media: media

    Type: array

    Description

    Media items associated with this lexeme form, such as recordings of this form being pronounced. When a media item pertains to a specific lexeme form, it should be placed in that form’s media field rather than the lexeme’s.

    Items must be unique: true

    Items

    Each item in this array must adhere to the following schema:

    Media Item (Database Reference): media

    Type: object

    Description

    A database reference to a media item associated with this lexeme form

    Referenced Schema

    This item must validate against the following schema:

    http://schemas.digitallinguistics.io/DatabaseReference.json

  • Morpheme Type: morphemeType

    Description

    The type of morpheme or complex construction that this lexeme is, optionally in multiple languages. Examples: root, stem, bipartite stem, enclitic, prefix, inflected word, phrase, circumfix, compound, complex, ideophonic. Typically, all the forms of a lexeme will have the same morpheme type, but occasionally they differ (e.g. independent vs. cliticized forms of a word, such as are vs. ='re in English).

    Referenced Schema

    This item must validate against the following schema:

    http://schemas.digitallinguistics.io/MultiLangString.json

  • Notes: notes

    Type: array

    Description

    A collection of notes about this lexeme form. Each Note object must have its noteType property specified. Notes with a note type of private are not intended for publication in dictionaries, while other types of notes are. For precision’s sake, it is recommended that notes be attached to specific forms rather than the lexeme whenever possible.

    Items must be unique: true

    Items

    Each item in this array must adhere to the following schema:

    notes

    This item must also validate against all of the following schemas:

    • Note

      Type: object

      Description

      A note about this lexeme form

      Referenced Schema

      This item must validate against the following schema:

      http://schemas.digitallinguistics.io/Note.json

    • Required Properties

      • noteType

      Properties

      The following properties are defined for this object:

      • Note Type: noteType

        Type: string

        Description

        The type of note about this lexeme form

        Allowed Values

        • "private"
        • "general"
        • "anthropology"
        • "discourse"
        • "encyclopedic"
        • "grammar"
        • "phonology"
        • "semantics"
        • "sociocultural"
  • Sources: sources

    Type: array

    Description

    A list of the initials of the speaker or speakers who are the attested sources for this lexeme form. For precision’s sake, sources should be listed for specific forms of a lexeme rather than the lexeme whenever possible. These sources should be DatabaseReferences to a Person object.

    Items must be unique: true

    Items

    Each item in this array must adhere to the following schema:

    Source: sources

    Type: object

    Description

    An attested source for this lexeme form. This will often be the initials of a speaker.

    Referenced Schema

    This item must validate against the following schema:

    http://schemas.digitallinguistics.io/DatabaseReference.json

  • Syllable Structure: syllableStructure

    Type: string

    Description

    An abstract representation of the syllable structure of this form, e.g. CVC

  • Tags: tags

    Type: object

    Description

    A set of tags for this lexeme form

    Referenced Schema

    This item must validate against the following schema:

    http://schemas.digitallinguistics.io/Tags.json

  • Tone: tone

    Type: string

    Description

    An abstract representation of the tonal pattern of this lexeme form. Examples: HLH, 313, ˦˨˦ etc.

  • Transcription: transcription

    Type: object

    Description

    A transcription of this lexeme form, optionally in multiple orthographies

    Referenced Schema

    This item must validate against the following schema:

    http://schemas.digitallinguistics.io/Transcription.json

  • Usages: usages

    Type: array

    Description

    A list of MultiLangStrings giving information about this form’s social usage, regional information, register, dialect, and/or connotations. Common values might be “archaic”, “colloquial”, “formal”, “positive”, “negative”, etc.

    Items must be unique: true

    Items

    Each item in this array must adhere to the following schema:

    Usage: usages

    Description

    A MultiLangString giving information about this form’s usage

    Referenced Schema

    This item must validate against the following schema:

    http://schemas.digitallinguistics.io/MultiLangString.json

  • Variant Of: variantOf

    Type: object

    Description

    If this lexeme form is a variant of another form, a reference to the other form should go here. For example, some speakers of English have hanged as the past tense of hang, while others have hung.

    Referenced Schema

    This item must validate against the following schema:

    http://schemas.digitallinguistics.io/DatabaseReference.json

  • Variants: variants

    Type: array

    Description

    A list of variants of this form. This field should be used for dialectal and idiolectal variants, rapid and careful speech variants, register-based variants, spelling variants, etc. It should not be used for phonologically-conditioned variants (use the allomorphs field of a specific form instead). Each variant should have its variantType property specified.

    Items must be unique: true

    Items

    Each item in this array must adhere to the following schema:

    variants

    This item must also validate against all of the following schemas:

    • Variant (Database Reference)

      Type: object

      Description

      A database reference to a variant of this form. Note: The Database Reference object must have a variantType property, indicating the type of variant.

      Referenced Schema

      This item must validate against the following schema:

      http://schemas.digitallinguistics.io/DatabaseReference.json

    • Required Properties

      • variantType

      Properties

      The following properties are defined for this object:

      • Variant Type: variantType

        Description

        This field is used to specify the type of variant. Possible values might be a person’s name (representing an idiolectal variant), or simply idiolectal, or dialectal (or the name of the dialect), or rapid speech, etc. May be in multiple languages.

        Referenced Schema

        This item must validate against the following schema:

        http://schemas.digitallinguistics.io/MultiLangString.json

  • Variant Type: variantType

    Description

    If this form is a variant of another lexeme form, this field can be used to specify the type of variant. Possible values might be a person’s name (representing an idiolectal variant), or simply idiolectal, or dialectal (or the name of the dialect), or rapid speech, etc. Optionally in multiple languages.

    Referenced Schema

    This item must validate against the following schema:

    http://schemas.digitallinguistics.io/MultiLangString.json
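
Some of the constraints listed for the properties above can be checked by hand. The sketch below covers three of them: unique array items, the required properties and non-empty environment strings of an allomorph, and the allowed noteType values. All function names are hypothetical. Referenced schemas (Transcription.json, Note.json, etc.) are not checked, and comparing items through json.dumps only approximates JSON Schema equality (for instance, it treats 1 and 1.0 as different).

```python
import json

# Allowed values for noteType, copied from the Notes property above.
NOTE_TYPES = {
    "private", "general", "anthropology", "discourse", "encyclopedic",
    "grammar", "phonology", "semantics", "sociocultural",
}


def has_unique_items(items: list) -> bool:
    """Approximate the 'Items must be unique' rule by comparing serialized items."""
    seen = set()
    for item in items:
        # Sorting keys makes objects with the same members compare equal.
        key = json.dumps(item, sort_keys=True, ensure_ascii=False)
        if key in seen:
            return False
        seen.add(key)
    return True


def check_allomorph(allomorph: dict) -> list:
    """Return a list of problems found in one item of the allomorphs array."""
    problems = []
    for name in ("environments", "transcription"):
        if name not in allomorph:
            problems.append("allomorph is missing required property: " + name)
    environments = allomorph.get("environments", [])
    if not isinstance(environments, list):
        problems.append("environments must be an array")
        return problems
    if not has_unique_items(environments):
        problems.append("environments must not contain duplicates")
    for env in environments:
        # Each environment is a string with a minimum length of 1.
        if not isinstance(env, str) or len(env) < 1:
            problems.append("each environment must be a non-empty string")
    return problems


def check_note(note: dict) -> list:
    """Return a list of problems with the noteType property of one note."""
    if "noteType" not in note:
        return ["note is missing required property: noteType"]
    note_type = note["noteType"]
    if not isinstance(note_type, str) or note_type not in NOTE_TYPES:
        return ["noteType is not one of the allowed values"]
    return []


# An allomorph shaped like the one in the schema's own example.
print(check_allomorph({"environments": ["_m"], "transcription": {"Mod": "gux"}}))
print(check_note({"noteType": "grammar"}))
```

An empty environments array passes check_allomorph, matching the note above that the array may be empty.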

Additional Properties

Any additional properties must adhere to the following schema:

This schema imposes no restrictions. All values are valid.

Examples

The following are example values for this schema:

  • {
      "allomorphs": [
        {
          "environments": [
            "_m"
          ],
          "syllableStructure": "CVC",
          "transcription": {
            "APA": "kʼuš",
            "IPA": "kˀuš",
            "Mod": "gux",
            "Swad": "guš"
          }
        }
      ],
      "bibliography": [
        {
          "citationKey": "Duralde1802"
        }
      ],
      "components": [
        {
          "id": "e0e2dbdb-f89b-4002-bd46-4f6803bb4391",
          "key": "gux"
        },
        {
          "id": "797f0f6b-3024-4d0c-bbfc-a1bc7cc48b81",
          "key": "t1"
        }
      ],
      "inflectionClass": "main verb",
      "link": "https://data.digitallinguistics.io/languages/Chitimacha/lexemes/guxt/forms/guxt",
      "media": [
        {
          "id": "24a14428-f2e7-47a8-8d4f-00c4437f6c3a",
          "filename": "guxt.wav"
        }
      ],
      "morphemeType": "stem",
      "sources": [
        {
          "source": {
            "abbreviation": "BP"
          }
        }
      ],
      "syllableStructure": "CVCC",
      "transcription": {
        "APA": "kʼušt",
        "IPA": "kˀušt",
        "Mod": "guxt",
        "Swad": "gušt"
      },
      "usages": [
        "colloquial"
      ],
      "variants": [
        {
          "id": "0f765e8d-1401-4c01-a88f-3092d077813b",
          "key": "guxma",
          "variantType": "pluractional"
        }
      ]
    }
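
As an illustration of reading data shaped like the example above, the following Python sketch picks a transcription by orthography, falling back through a list of preferences. The function name pick_transcription and the preference order are hypothetical; the orthography keys (APA, IPA, Mod, Swad) and their values are taken from the example.

```python
def pick_transcription(form: dict, preferred: list):
    """Return the first available transcription among the preferred orthographies."""
    transcription = form.get("transcription", {})
    for orthography in preferred:
        if orthography in transcription:
            return transcription[orthography]
    # None of the preferred orthographies is present on this form.
    return None


# The transcription object from the example above.
form = {
    "transcription": {
        "APA": "kʼušt",
        "IPA": "kˀušt",
        "Mod": "guxt",
        "Swad": "gušt",
    }
}

print(pick_transcription(form, ["Mod", "IPA"]))  # prints guxt
```

The same helper would work on an item of the allomorphs array, since each allomorph carries its own transcription object.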