Module:Language/name/data/iana data extraction tool
This is a crude tool that reads a local copy of an iso-639-3_Name_Index_YYYYMMDD.tab file from sil.org and extracts the information needed to create the data table held by Module:Language/data/ISO_639-3.
Usage
To use this tool:
- Open a blank sandbox page and paste this {{#invoke:}} on the first line of the page:
{{#invoke:Language/name/data/ISO 639-3 data extraction tool|ISO_639_3_extract|file-date=YYYYMMDD}}
- where YYYYMMDD is the year, month, and day taken from the .tab file name (used to place a file-date comment in Module:Language/data/ISO_639-3)
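- Download the complete Code Tables Set UTF-8 version zip file
- Unzip iso-639-3_Name_Index_YYYYMMDD.tab and open it in a plain-text editor
- Copy its contents and paste them into the sandbox page below the {{#invoke:}}
- Click [Show preview]
- Wait
- Get the result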
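Some crude error checking inserts error messages into the output. There is no guarantee that these messages will be helpful. Search the tool's output for the word "error".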
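The search for error entries can also be done mechanically. The following is a minimal sketch in plain Lua (run outside the wiki), assuming the tool's output has been copied into a string and that error entries have the `["error"] = {...}` shape written by the module's error branch. The helper name `find_error_lines` and the sample text are hypothetical and not part of the module.

```lua
-- Hypothetical helper, not part of the module: collect the output lines
-- that contain an ["error"] entry, prefixed with their line number.
local function find_error_lines (output)
	local hits = {};
	local line_number = 0;
	for line in (output .. '\n'):gmatch ('(.-)\r?\n') do		-- walk the output line by line
		line_number = line_number + 1;
		if line:find ('["error"]', 1, true) then				-- plain-text find, no pattern matching
			table.insert (hits, line_number .. ': ' .. line);
		end
	end
	return hits;
end

-- Made-up sample output, for illustration only
local sample = table.concat ({
	'-- File-Date: 20170217',
	'return {',
	'\t["aaq"] = {"Eastern Abnaki"},',
	'\t["error"] = {"unparsed record"},',
	'\t}',
}, '\n');

for _, hit in ipairs (find_error_lines (sample)) do
	print (hit);
end
```

An empty result only means that no `["error"]` entry was written; it does not show that every record in the .tab file was picked up.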
require('strict');
local p = {};
--[=[------------------------< I S O _ 6 3 9 _ 3 _ E X T R A C T >---------------------------------------------
{{#invoke:Language/name/data/ISO 639-3 data extraction tool|ISO_639_3_extract|file-date=20170217}}
reads a local copy of iso-639-3_Name_Index_YYYYMMDD.tab (where YYYYMMDD is the release date). Download that file
in zip form from http://www-01.sil.org/iso639-3/download.asp (use the UTF-8 zip)
useful lines in the file have the form:
<id>\t<name>\t<inverted name>\n
where:
<id> is the three-character ISO 639-3 language code
<name> is the language 'name'
<inverted name> is the language in 'last-name, first-name(s)' form; this part is ignored
like this:
aaq	Eastern Abnaki	Abnaki, Eastern
when a language code has more than one name, the code is repeated for each additional name:
rar	Cook Islands Maori	Maori, Cook Islands
rar	Rarotongan	Rarotongan
]=]
function p.ISO_639_3_extract (frame)
	local page = mw.title.getCurrentTitle();								-- get a page object for this page
	local content = page:getContent() or '';								-- get unparsed content; empty string when the page has none
	local lang_table = {};													-- languages go here
	local file_date = 'File-Date: ' .. (frame.args["file-date"] or '');	-- set the file date line from |file-date=; empty when the parameter is omitted

	for code, name in mw.ustring.gmatch (content, '%f[%a](%a%a%a)\t([^\t]+)\t[^\n]+\n') do	-- get code and 'forward' name
		if code then
			if string.find (lang_table[#lang_table] or '', '^%[\"' .. code) then			-- if this is an additional name for code ('or' empty string for first time when lang_table[#lang_table] is nil)
				lang_table[#lang_table] = mw.ustring.gsub (lang_table[#lang_table], '}$', '');	-- remove trailing brace from previous name
				lang_table[#lang_table] = lang_table[#lang_table] .. ', \"' .. name .. '\"}';	-- add this name with new brace
			else
				table.insert (lang_table, "[\"" .. code .. "\"] = {\"" .. name .. "\"}");		-- make new table entry
			end
		else
			table.insert (lang_table, "[\"error\"] = {\"" .. tostring (name) .. "\"}");		-- code should never be nil, but inserting an error entry in the final output can be helpful
		end
	end
																			-- make pretty output
	return "<br /><pre>-- " .. file_date .. "<br />return {<br />	" .. table.concat (lang_table, ',<br />	') .. "<br />	}<br />" .. "</pre>";
end
return p;
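
To try the parsing logic outside a wiki, the loop can be approximated in plain Lua. This is a sketch under stated assumptions, not the module itself: it swaps `mw.ustring.gmatch` for the byte-oriented `string.gmatch` (adequate here because the pattern relies only on ASCII letters, tabs, and newlines), it builds a Lua table instead of output text, and it groups names by code wherever they occur rather than only on consecutive lines. The function name `extract` is hypothetical; the sample records are the ones quoted in the module's header comment.

```lua
-- Sketch only: reuses the module's pattern with the standard string library.
local function extract (content)
	local langs = {};		-- code -> list of names
	local order = {};		-- codes in first-seen order
	for code, name in content:gmatch ('%f[%a](%a%a%a)\t([^\t]+)\t[^\n]+\n') do
		if not langs[code] then					-- first name seen for this code
			langs[code] = {};
			table.insert (order, code);
		end
		table.insert (langs[code], name);		-- additional names are appended
	end
	return langs, order;
end

-- Records from the header comment, with tab separators and trailing newlines
local sample =
	'aaq\tEastern Abnaki\tAbnaki, Eastern\n' ..
	'rar\tCook Islands Maori\tMaori, Cook Islands\n' ..
	'rar\tRarotongan\tRarotongan\n';

local langs, order = extract (sample);
for _, code in ipairs (order) do
	print (code, table.concat (langs[code], '; '));
end
```

Because the pattern requires a trailing newline, a final record that lacks one is skipped, so it is worth checking that the pasted data ends with a line break.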