Module:form of/data/1


--[=[
This module lists the more common recognized inflection tags, along with their shortcut aliases, the corresponding
glossary entry or page describing the tag, and the corresponding Wikidata entry. The less common tags are in
[[Module:form of/data/2]]. We divide the tags this way to save memory. Be careful about adding more tags to this module;
add them to the other module unless you're sure they are common.

TAGS is a table where keys are the canonical form of an inflection tag and the corresponding values are tables
describing the tags, consisting of the following keys:

	- 1: Type of the tag ("person", "number", "gender", "case", "animacy", "tense-aspect", "mood", "voice-valence",
		 etc.).
	- 2: Anchor or page describing the inflection tag, with the following values:
		 * nil: No link.
		 * APPENDIX: Anchor in [[Appendix:Glossary]] whose name is the same as the tag.
		 * WIKT: Page in the English Wiktionary whose name is the same as the tag.
		 * WP: Page in the English Wikipedia whose name is the same as the tag.
		 * A string: If prefixed by 'w:' the specified page in the English Wikipedia. If prefixed by 'wikt:', the
		   specified page in the English Wiktionary. Otherwise, an anchor in [[Appendix:Glossary]].
		 NOTE: GLOSSARY ANCHORS ARE PREFERRED. Other types of entries should be migrated to the glossary, with links to
		 Wikipedia and/or Wiktionary entries as appropriate.
	- 3: List of shortcuts (i.e. aliases for the inflection tag) or a single shortcut string, or nil.
	- 4: Numeric value of Wikidata identifier (see wikidata.org) for the concept most closely describing this tag.
		 (The actual Wikidata identifier is a string formed by prefixing the number with Q.)
	- display: If specified, consists of text to display in the definition line, in lieu of the canonical form of the
			   inflection tag. If there is a glossary entry, the displayed text forms the right side of the two-part
			   glossary link.
	- no_space_on_left: If specified, don't display a space to the left of the tag. Used for punctuation.
	- no_space_on_right: If specified, don't display a space to the right of the tag. Used for punctuation.
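
As an illustration of the layout above, the following standalone sketch builds two entries and reads their fields back.
It makes several assumptions: the APPENDIX sentinel is a local stand-in (the real one comes from
[[Module:form of/data]]), and anchor_for() and wikidata_id() are hypothetical helpers that are not part of this module.

```lua
-- Stand-in sentinel; the real module imports APPENDIX from Module:form of/data.
local APPENDIX = {}

local tags = {}
tags["isahan"] = {
	"number",                 -- 1: type of the tag
	"isahang bilang",         -- 2: explicit glossary anchor
	{"s", "sg", "singular"},  -- 3: shortcuts
	110786,                   -- 4: numeric part of the Wikidata identifier
}
tags["progresibo"] = {
	"tense-aspect",
	APPENDIX,                 -- 2: glossary anchor named after the tag itself
	{"prog", "progressive"},
	56653945,
}

-- Hypothetical helper: return the glossary anchor for a tag, or nil if the
-- entry has no link. Only the APPENDIX and plain-string cases are handled.
local function anchor_for(tag)
	local data = tags[tag]
	if not data or data[2] == nil then
		return nil
	elseif data[2] == APPENDIX then
		return tag
	end
	return data[2]
end

-- Hypothetical helper: form the Wikidata identifier by prefixing "Q".
local function wikidata_id(tag)
	local data = tags[tag]
	if data and data[4] then
		return "Q" .. data[4]
	end
	return nil
end

print(anchor_for("isahan"), wikidata_id("isahan"))
print(anchor_for("progresibo"), wikidata_id("progresibo"))
```

The sketch ignores the WIKT and WP sentinels and the 'w:' and 'wikt:' prefixes, which the real link-generating code
would also need to handle.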

SHORTCUTS is a table mapping shortcut aliases to canonical inflection tag names. Shortcuts are of one of three types:
(1) A simple alias of a tag. These do not need to be entered explicitly into the table; code at the end of the module
	automatically fills in these entries based on the information in TAGS.
(2) An alias to a multipart tag. For example, the alias "mf" maps to the multipart tag "m//f", which will in turn be
	expanded into the canonical multipart tag {"panlalaki", "pambabae"}, which will display as (approximately)
	"[[Appendix:Glossary#gender|panlalaki]] and [[Appendix:Glossary#gender|pambabae]]". The number of such aliases
	should be limited, and should cover only the most common combinations.

	Normally, multipart tags are displayed using serialCommaJoin() in [[Module:table]] to appropriately join the display
	form of the individual tags using commas and/or "and". However, some multipart tags are displayed specially; see
	DISPLAY_HANDLERS below. Note that aliases to multipart tags can themselves contain simple aliases in them.
(3) An alias to a list of multiple tags (which may themselves be simple or multipart aliases). Specifying the alias is
	exactly equivalent to specifying the tags in the list in order, one after another. An example is "1s", which maps to
	the list {"1", "s"}. The number of such aliases should be limited, and should cover only the most common
	combinations.
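
The following standalone sketch illustrates alias types (1) and (3). It is not the code this module actually runs:
fill_simple_aliases() and expand() are hypothetical helpers written for this example, the tables are cut down to three
tags, and multipart aliases of type (2) such as "m//f" are deliberately left out.

```lua
local tags = {
	["unang panauhan"] = {"person", nil, {"1", "first-person"}},
	["isahan"] = {"number", nil, {"s", "sg", "singular"}},
	["ugat"] = {"grammar", nil, "root"},
}

local shortcuts = {
	["1s"] = {"1", "s"},  -- type (3): alias to a list of tags
}

-- Hypothetical stand-in for the fill-in step: register every simple alias
-- found in field 3 of each tag, which may be a string or a list of strings.
local function fill_simple_aliases(tag_table, shortcut_table)
	for canonical, data in pairs(tag_table) do
		local aliases = data[3]
		if type(aliases) == "string" then
			aliases = {aliases}
		end
		for _, alias in ipairs(aliases or {}) do
			shortcut_table[alias] = canonical
		end
	end
end

-- Hypothetical resolver: expand a tag or alias into a flat list of canonical
-- tag names, preserving the order given in list aliases.
local function expand(tag, out)
	out = out or {}
	local target = shortcuts[tag]
	if type(target) == "table" then
		for _, part in ipairs(target) do
			expand(part, out)
		end
	elseif target then
		out[#out + 1] = target
	else
		out[#out + 1] = tag  -- already canonical, or unknown
	end
	return out
end

fill_simple_aliases(tags, shortcuts)
print(table.concat(expand("1s"), ", "))
```

Under these assumptions "1s" resolves to the same tags as writing "1" followed by "s", which is the equivalence
described in (3).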

NOTE: In some cases below, multiple tags point to the same Wikidata item, because Wikipedia considers them synonyms. Examples
are indirect case vs. objective case vs. oblique case, and inferential mood vs. renarrative mood. We do this because
(a) we want to allow users to choose their own terminology; (b) we want to be able to use the terminology most common
for the language in question; (c) terms considered synonyms may or may not actually be synonyms, as different languages
may use the terms differently. For example, although the Wikipedia page on [[w:inferential mood]] claims that
inferential and renarrative moods are the same, the page on [[w:Bulgarian_verbs#Evidentials]] claims that Bulgarian has
both, and that they are not the same.
]=]

local m_form_of_data = require("Module:form of/data")

local APPENDIX = m_form_of_data.APPENDIX
local WP = m_form_of_data.WP
local WIKT = m_form_of_data.WIKT

local tags = {}
local shortcuts = {}


----------------------- Person -----------------------

tags["unang panauhan"] = { --TLCHANGE
	"person",
	APPENDIX, --TLCHANGE "first person"
	{"1", "first-person"}, --TLCHANGE
	21714344,
}

tags["ikalawang panauhan"] = { --TLCHANGE
	"person",
	APPENDIX, --TLCHANGE "second person"
	{"2", "second-person"}, --TLCHANGE
	51929049,
}

tags["ikatlong panauhan"] = { --TLCHANGE
	"person",
	APPENDIX, --TLCHANGE "third person"
	{"3", "third-person"}, --TLCHANGE
	51929074,
}

tags["di-panao"] = { --TLCHANGE
	"person",
	APPENDIX,
	{"impers", "impersonal"}, --TLCHANGE
}

shortcuts["12"] = "1//2"
shortcuts["13"] = "1//3"
shortcuts["23"] = "2//3"
shortcuts["123"] = "1//2//3"


----------------------- Number -----------------------

tags["isahan"] = { --TLCHANGE
	"number",
	"isahang bilang", --TLCHANGE "singular number",
	{"s", "sg", "singular"},
	110786,
}

tags["dalawahan"] = { --TLCHANGE
	"number",
	"dalawahang bilang", --TLCHANGE "dual number",
	{"d", "du", "dual"}, --TLCHANGE
	110022,
}

tags["maramihan"] = { --TLCHANGE
	"number",
	"maramihang bilang", --TLCHANGE "plural number",
	{"p", "pl", "plural"}, --TLCHANGE
	146786,
}

tags["single-possession"] = {
	"number",
	"singular number",
	"spos",
	110786, -- Singular
}

tags["multiple-possession"] = {
	"number",
	"plural number",
	"mpos",
	146786, -- Plural
}

shortcuts["1s"] = {"1", "s"}
shortcuts["2s"] = {"2", "s"}
shortcuts["3s"] = {"3", "s"}
shortcuts["1d"] = {"1", "d"}
shortcuts["2d"] = {"2", "d"}
shortcuts["3d"] = {"3", "d"}
shortcuts["1p"] = {"1", "p"}
shortcuts["2p"] = {"2", "p"}
shortcuts["3p"] = {"3", "p"}


----------------------- Gender -----------------------

tags["panlalaki"] = { --TLCHANGE
	"gender",
	"gender",
	{"m", "masculine"}, --TLCHANGE
	499327,
}

-- This is useful e.g. in Swedish.
tags["natural masculine"] = {
	"gender",
	"gender",
	"natm",
}

tags["pambabae"] = { --TLCHANGE
	"gender",
	"gender",
	{"f", "feminine"}, --TLCHANGE
	1775415,
}

tags["pambalaki"] = { --TLCHANGE
	"gender",
	"gender",
	{"n", "neuter"}, --TLCHANGE
	1775461,
}

tags["common"] = {
	"gender",
	"gender",
	"c",
	1305037,
}

tags["nonvirile"] = {
	"gender",
	APPENDIX,
	"nv",
}

shortcuts["mf"] = "m//f"
shortcuts["mn"] = "m//n"
shortcuts["fn"] = "f//n"
shortcuts["mfn"] = "m//f//n"


----------------------- Animacy -----------------------

-- May sometimes be useful for [[Module:object usage]].

tags["animate"] = {
	"animacy",
	APPENDIX,
	"an",
	51927507,
}

tags["inanimate"] = {
	"animacy",
	APPENDIX,
	{"in", "inan"},
	51927539,
}

tags["personal"] = {
	"animacy",
	APPENDIX,
	{"pr", "pers"},
	63302102,
}


----------------------- Tense/aspect -----------------------

tags["pangkasalukuyan"] = { --TLCHANGE
	"tense-aspect",
	"panahunang pangkasalukuyan", --TLCHANGE "present tense",
	{"pres", "present"}, --TLCHANGE
	192613,
}

tags["pangnagdaan"] = { --TLCHANGE
	"tense-aspect",
	"panahunang pangnagdaan", --TLCHANGE "past tense",
	"past", --TLCHANGE
	1994301,
}

tags["panghinaharap"] = { --TLCHANGE
	"tense-aspect",
	"panahunang panghinaharap", --TLCHANGE "future tense",
	{"fut", "futr", "future"}, --TLCHANGE
	501405,
}

tags["perpektong panghinaharap"] = { --TLCHANGE
	"tense-aspect",
	APPENDIX,
	{"futp", "fperf", "future perfect"}, --TLCHANGE
	1234617,
}

tags["di-pangnagdaan"] = { --TLCHANGE
	"tense-aspect",
	"panahunang di-pangnagdaan", --TLCHANGE "non-past tense",
	{"npast", "non-past"}, --TLCHANGE
	16916993,
}

tags["progresibo"] = { --TLCHANGE
	"tense-aspect",
	APPENDIX,
	{"prog", "progressive"}, --TLCHANGE
	56653945,
}

tags["preterito"] = { --TLCHANGE no Tagalog equivalent
	"tense-aspect",
	APPENDIX,
	{"pret", "preterite"}, --TLCHANGE
	442485,
}

tags["perpekto"] = { --TLCHANGE no Tagalog equivalent
	"tense-aspect",
	APPENDIX,
	{"perf", "perfect"}, --TLCHANGE
	625420,
}

tags["imperpekto"] = { --TLCHANGE no Tagalog equivalent
	"tense-aspect",
	APPENDIX,
	{"impf", "imperf", "imperfect"}, --TLCHANGE
}

tags["pluskuwamperfekto"] = { --TLCHANGE no Tagalog equivalent, from Spanish "pluscuamperfecto"
	"tense-aspect",
	APPENDIX,
	{"plup", "pluperf", "pluperfect"}, --TLCHANGE
	623742,
}

tags["aoristo"] = { --TLCHANGE no Tagalog equivalent
	"tense-aspect",
	"aorist tense",
	{"aor", "aori", "aorist"}, --TLCHANGE
	216497,
}

tags["makasaysayang pangnagdaan"] = { --TLCHANGE
	"tense-aspect",
	nil,
	{"phis", "past historic"}, --TLCHANGE
	442485,  -- Preterite
}

tags["imperpektibo"] = { --TLCHANGE
	"tense-aspect",
	APPENDIX,
	{"impfv", "imperfv", "imperfective"}, --TLCHANGE
	371427,
}

tags["perpektibo"] = { --TLCHANGE
	"tense-aspect",
	APPENDIX,
	{"pfv", "perfv", "perfective"}, --TLCHANGE
	1424306,
}

shortcuts["spast"] = {"simple", "past"}
shortcuts["simple past"] = {"simple", "past"}
shortcuts["spres"] = {"simple", "present"}
shortcuts["simple present"] = {"simple", "present"}


----------------------- Mood -----------------------

tags["pautos"] = { --TLCHANGE
	"mood",
	"panaganong pautos", --TLCHANGE "imperative mood",
	{"imp", "impr", "impv", "imperative"}, --TLCHANGE
	22716,
}

tags["paturol"] = { --TLCHANGE
	"mood",
	"panaganong paturol", --TLCHANGE "indicative mood",
	{"ind", "indc", "indic", "indicative"}, --TLCHANGE
	682111,
}

tags["pasakali"] = { --TLCHANGE
	"mood",
	"panaganong pasakali", --TLCHANGE "subjunctive mood",
	{"sub", "subj", "subjunctive"}, --TLCHANGE
	473746,
}

tags["panubali"] = { --TLCHANGE
	"mood",
	"panaganong panubali", --TLCHANGE "conditional mood",
	{"cond", "conditional"}, --TLCHANGE
	625581,
}

tags["pamaraan"] = { --TLCHANGE
	"mood",
	"w:modality (linguistics)",
	{"mod", "modal"}, --TLCHANGE
	1243600,
}

tags["optative"] = {
	"mood",
	"optative mood",
	{"opta", "opt"},
	527205,
}

tags["jussive"] = {
	"mood",
	"jussive mood",
	"juss",
	462367,
}

tags["hortative"] = {
	"mood",
	WP,
	"hort",
	5906629,
}


----------------------- Voice/valence -----------------------

-- This tag type combines what is normally called "voice" (active, passive, middle, mediopassive) with other tags that
-- aren't normally called voice but are similar in that they control the valence/valency (number and structure of the
-- arguments of a verb).
tags["tukuyan"] = { --TLCHANGE
	"voice-valence",
	"tukuyan/tahasan", --TLCHANGE "active voice",
	{"act", "actv", "active", "tahasan"}, --TLCHANGE
	1317831,
}

tags["middle"] = {
	"voice-valence",
	"middle voice",
	{"mid", "midl"},
}

tags["balintiyak"] = { --TLCHANGE
	"voice-valence",
	APPENDIX, --TLCHANGE "passive voice",
	{"pass", "pasv", "passive", "kabalikan"}, --TLCHANGE
	1194697,
}

tags["mediopassive"] = {
	"voice-valence",
	APPENDIX,
	{"mp", "mpass", "mpasv", "mpsv"},
	1601545,
}

tags["paiba"] = { --TLCHANGE
	"voice-valence",
	APPENDIX,
	{"refl", "reflexive"}, --TLCHANGE
	13475484, -- for "reflexive verb"
}

tags["palipat"] = { --TLCHANGE
	"voice-valence",
	"pandiwang palipat", --TLCHANGE "transitive verb",
	{"tr", "vt", "transitive"}, --TLCHANGE
	1774805, -- for "transitive verb"
}

tags["katawanin"] = { --TLCHANGE
	"voice-valence",
	"pandiwang katawanin", --TLCHANGE "intransitive verb",
	{"intr", "vi", "intransitive"}, --TLCHANGE
	1166153, -- for "intransitive verb"
}

tags["ditransitive"] = {
	"voice-valence",
	"ditransitive verb",
	"ditr",
	2328313, -- for "ditransitive verb"
}

tags["causative"] = {
	"voice-valence",
	APPENDIX,
	"caus",
	56677011, -- for "causative verb"
}


----------------------- Non-finite -----------------------

tags["pawatas"] = { --TLCHANGE
	"non-finite",
	APPENDIX,
	{"inf", "infinitive"}, --TLCHANGE
	179230,
}

-- A form found in Portuguese and Galician, as well as in Hungarian. This is probably unnecessary and can be replaced
-- with the regular "infinitive" tag. A personal infinitive is not a separate infinitive from the plain infinitive, just
-- an inflection of the infinitive.
tags["personal infinitive"] = {
	"non-finite",
	"w:Portuguese verb conjugation",
	"pinf",
}

tags["pandiwari"] = { --TLCHANGE
	"non-finite",
	APPENDIX,
	{"part", "ptcp", "participle"}, --TLCHANGE
	814722,
}

tags["pangngalang makadiwa"] = { --TLCHANGE
	"non-finite",
	APPENDIX,
	{"vnoun", "verbal noun"}, --TLCHANGE
	1350145,
}

tags["gerund"] = {
	"non-finite",
	APPENDIX,
	"ger",
	1923028,
}

tags["supine"] = {
	"non-finite",
	APPENDIX,
	"sup",
	548470,
}

tags["transgressive"] = {
	"non-finite",
	APPENDIX,
	nil,
	904896,
}


----------------------- Case -----------------------

tags["ablative"] = {
	"case",
	"ablative case",
	"abl",
	156986,
}

tags["akusatibo"] = { --TLCHANGE
	"case",
	"accusative case",
	{"acc", "accusative"}, --TLCHANGE
	146078,
}

tags["dative"] = {
	"case",
	"dative case",
	"dat",
	145599,
}

tags["paari"] = { --TLCHANGE
	"case",
	"genitive case",
	{"gen", "genitive"}, --TLCHANGE
	146233,
}

tags["instrumental"] = {
	"case",
	"instrumental case",
	"ins",
	192997,
}

tags["locative"] = {
	"case",
	"locative case",
	"loc",
	202142,
}

tags["palagyo"] = { --TLCHANGE
	"case",
	"nominative case",
	{"nom", "nominative"}, --TLCHANGE
	131105,
}

tags["prepositional"] = {
	"case",
	"prepositional case",
	{"pre", "prep"},
	2114906,
}

tags["panawag"] = { --TLCHANGE
	"case",
	"vocative case",
	{"voc", "vocative"}, --TLCHANGE
	185077,
}


----------------------- State -----------------------

tags["construct"] = {
	"state",
	"construct state",
	{"cons", "construct state"},
	1641446,
	display = "construct state",
}

tags["definite"] = {
	"state",
	APPENDIX,
	{"def", "defn", "definite state"},
	53997851,
}

tags["panaklaw"] = { --TLCHANGE
	"state",
	APPENDIX,
	{"indef", "indf", "indefinite state", "indefinite"}, --TLCHANGE
	53997857,
}

tags["paukol"] = { --TLCHANGE
	"state",
	WP,
	{"poss", "possessive"}, --TLCHANGE
	2105891,
}

tags["strong"] = {
	"state",
	"indefinite",
	"str",
	53997857, -- Indefinite
}

tags["weak"] = {
	"state",
	"definite",
	"wk",
	53997851, -- Definite
}

tags["mixed"] = {
	"state",
	APPENDIX,
	"mix",
	63302161,
}

tags["attributive"] = {
	"state",
	APPENDIX,
	"attr",
}

tags["predicative"] = {
	"state",
	APPENDIX,
	"pred",
}


----------------------- Degrees of comparison -----------------------

tags["kaantasang lantay"] = { --TLCHANGE
	"comparison",
	"positive",
	{"posd", "positive", "positive degree"}, --TLCHANGE
	3482678, -- No English Wikipedia article; only Czech, Estonian, Finnish and various Nordic-language ones.
}

tags["kaantasang pahambing"] = { --TLCHANGE
	"comparison",
	"comparative",
	{"comd", "comparative", "comparative degree"}, --TLCHANGE
	14169499,
}

tags["kaantasang pasukdol"] = { --TLCHANGE
	"comparison",
	"superlative",
	{"supd", "superlative", "superlative degree"}, --TLCHANGE
	1817208,
}


----------------------- Register -----------------------

----------------------- Deixis -----------------------

----------------------- Clusivity -----------------------

----------------------- Inflectional class -----------------------

tags["makahalip"] = { --TLCHANGE
	"class",
	WIKT,
	{"pron", "pronominal"}, --TLCHANGE
	12721180, -- for "pronominal attribute", existing only in the Romanian Wikipedia
}


----------------------- Attitude -----------------------

-- This is a vague tag type grouping augmentative, diminutive and pejorative, which generally indicate the speaker's
-- attitude towards the object in question (as well as often indicating size).

tags["palaki"] = { --TLCHANGE
	"attitude",
	APPENDIX,
	{"aug", "augmentative"}, --TLCHANGE
	1358239,
}

tags["paliit"] = { --TLCHANGE
	"attitude",
	APPENDIX,
	{"dim", "diminutive"}, --TLCHANGE
	108709,
}

tags["pejorative"] = {
	"attitude",
	APPENDIX,
	"pej",
	545779,
}


----------------------- Sound changes -----------------------

tags["contracted"] = {
	"sound change",
	nil,
	"contr",
	126473,
}

tags["uncontracted"] = {
	"sound change",
	nil,
	"uncontr",
}

----------------------- Misc grammar -----------------------

shortcuts["past-cl"] = {"past", "-", "tense", "clause"}
shortcuts["pres-cl"] = {"pres", "-", "tense", "clause"}
shortcuts["fut-cl"] = {"fut", "-", "tense", "clause"}
shortcuts["ind-cl"] = {"ind", "clause"}
shortcuts["sub-cl"] = {"sub", "clause"}
shortcuts["past-sub-cl"] = {"past", "sub", "clause"}
shortcuts["pres-sub-cl"] = {"pres", "sub", "clause"}
shortcuts["fut-sub-cl"] = {"fut", "sub", "clause"}
shortcuts["cond-cl"] = {"cond", "clause"}
shortcuts["cond-past-cl"] = {"cond", "past", "clause"}

tags["simple"] = {
	"grammar",
	nil,
	"sim",
}

tags["short"] = {
	"grammar",
}

tags["long"] = {
	"grammar",
}

tags["anyo"] = { --TLCHANGE
	"grammar",
	nil,
	"form", --TLCHANGE
}

tags["malapang-uri"] = { --TLCHANGE
	"grammar",
	WIKT,
	{"adj", "adjectival"}, --TLCHANGE
}

tags["malapang-abay"] = { --TLCHANGE
	"grammar",
	APPENDIX,
	{"adv", "adverbial"}, --TLCHANGE
}

tags["negative"] = {
	"grammar",
	"w:affirmation and negation",
	"neg",
	63302088,
}

tags["nominalized"] = {
	"grammar",
	nil,
	"nomz",
	4683152, -- entry for "nominalized adjective"
}

tags["nominalization"] = {
	"grammar",
	nil,
	"nomzn",
	1500667,
}

tags["ugat"] = { --TLCHANGE
	"grammar",
	nil,
	"root", --TLCHANGE
	111029,
}

tags["stem"] = {
	"grammar",
	nil,
	nil,
	210523,
}

tags["dependent"] = {
	"grammar",
	nil,
	"dep",
	1122094, -- entry for "dependent clause"
}

tags["independent"] = {
	"grammar",
	nil,
	"indep",
	1419215, -- entry for "independent clause"
}

tags["simuno"] = { --TLCHANGE
	"grammar",
	APPENDIX,
	{"sbj", "subject"}, --TLCHANGE; "sub" and "subj" are used for subjunctive
	164573,
}

tags["layon"] = { --TLCHANGE
	"grammar",
	APPENDIX,
	{"obj", "object"}, --TLCHANGE
	175026,
}

tags["tuwirang layon"] = { --TLCHANGE
	"grammar",
	APPENDIX,
	{"dirobj", "direct object"}, --TLCHANGE
	2990574,
}

tags["di-tuwirang layon"] = { --TLCHANGE
	"grammar",
	APPENDIX,
	{"indirobj", "indirect object"}, --TLCHANGE
	1094061,
}

tags["panahunan"] = { --TLCHANGE
	"grammar",
	APPENDIX,
	"tense", --TLCHANGE
	177691,
}

tags["sugnay"] = { --TLCHANGE
	"grammar",
	APPENDIX,
	"clause", --TLCHANGE
	117364,
}

tags["parirala"] = { --TLCHANGE
	"grammar",
	APPENDIX,
	"phrase", --TLCHANGE
	187931,
}

tags["indirect question"] = {
	"grammar",
	WP,
	{"indq", "indirq"},
	1661444,
}


----------------------- Other tags -----------------------

-- This consists of non-content words like "and" as well as punctuation characters. If the punctuation characters appear
-- by themselves as tags, we special-case the handling of surrounding spaces so the output looks correct.

tags["at"] = { --TLCHANGE
	"other",
	nil,
	"and", --TLCHANGE
}

-- HACK! "in" is a shortcut for "inanimate" so create "!in" to display "in". We should generalize this like for labels.
tags["!in"] = {
	"other",
	display = "in",
}

tags[","] = {
	"other",
	no_space_on_left = true,
}

tags[":"] = {
	"other",
	no_space_on_left = true,
}

tags["/"] = {
	"other",
	no_space_on_left = true,
	no_space_on_right = true,
}

tags["("] = {
	"other",
	no_space_on_right = true,
}

tags[")"] = {
	"other",
	no_space_on_left = true,
}

tags["["] = {
	"other",
	no_space_on_right = true,
}

tags["]"] = {
	"other",
	no_space_on_left = true,
}

tags["-"] = { -- regular hyphen-minus
	"other",
	no_space_on_left = true,
	no_space_on_right = true,
}


----------------------- Create the shortcuts list -----------------------

m_form_of_data.finalize(tags, shortcuts)

return {tags = tags, shortcuts = shortcuts}