Unrestricted (full-text) (regex) search of huge (HTML) files (with many Markdown-format dictionaries attached)

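For example, take a huge Markdown/HTML file (1 GB or 50 GB, such as the whole of Wikipedia) that also contains other media such as images. The goal is to run (regex) searches over its entries or its full text.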

Requirements:

Searching must not be tied to a particular tool; any tool should work, e.g. grep, rg, ag.

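Searching must work across tags. For example, where the Markdown contains “_dipl_acusis”,

a search for “diplacusis” should still find it. Clicking a search result should display the matching part of the file, including its various media.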
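To make the cross-tag requirement concrete, here is a minimal Python sketch of one possible way to do it. It is not how hugetext.html works; the names `MARKUP`, `strip_markup` and `search_across_markup` are made up for illustration. It removes HTML tags and a few Markdown emphasis characters, runs the regex on the remaining plain text, and maps each hit back to its position in the original text so the surrounding content can still be shown.

```python
import re

# Assumption: "markup" means HTML tags plus the Markdown characters _ * `
# This is deliberately crude; it would also strip underscores in snake_case words.
MARKUP = re.compile(r"<[^>]+>|[_*`]")

def strip_markup(text):
    """Return (plain, offsets); offsets[i] is the index in text of plain[i]."""
    plain, offsets = [], []
    pos = 0
    for m in MARKUP.finditer(text):
        # Keep everything between the previous marker and this one.
        for i in range(pos, m.start()):
            plain.append(text[i])
            offsets.append(i)
        pos = m.end()
    # Keep the tail after the last marker.
    for i in range(pos, len(text)):
        plain.append(text[i])
        offsets.append(i)
    return "".join(plain), offsets

def search_across_markup(pattern, text):
    """Regex-search the markup-free text; report spans in the original text."""
    plain, offsets = strip_markup(text)
    results = []
    for m in re.finditer(pattern, plain):
        if m.end() == m.start():
            continue  # skip empty matches, they have no span to map back
        start = offsets[m.start()]
        end = offsets[m.end() - 1] + 1
        results.append((start, end, text[start:end]))
    return results

if __name__ == "__main__":
    print(search_across_markup(r"diplacusis", "see _dipl_acusis here"))
```

For a file of 1 GB or more, this would have to be applied chunk by chunk (for instance per line or per entry) rather than to the whole text at once, since the offset list costs one integer per character.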

The approach I came up with is to use "file:///…/wikipedia.html?app=https://…/hugetext.html", which first loads a huge-file reader that then handles the local wikipedia.

For details, see https://uwebzh.netlify.app/en/html5/index.html

HTML5 applications of this kind also support mdict dictionary lookup.

Finally, the Markdown-format dictionaries are attached:

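The OP's browser looks very powerful, but the instructions look complicated. Could you record a tutorial video? That would make it easier for users to learn, and would also help promote the software.
Does opening a file require typing the URL in directly, or can one simply tap to select it?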


Sorry to reply in English: the only Chinese input method I have is in Pale Moon, which is not allowed to log in to this forum. Copied UTF-8 characters are not displayed properly (although they work when copied into the title). Does anybody know the underlying reason?

The local file is usually used as a search engine, which can be scanned and generated by a simple script or by clicking the appropriate links provided on my website.

There are other ways, such as bookmarks and the long-press menus for links in a file manager.

Regular browser users do not expect the large number of features in uweb (such as HTML5 applications), so these do not appear by default. Usually, users need to click links on the website to install search engines, menus, buttons, scripts, and configs.