文本提取:从布局数据中分离出可以翻译的文本。
Text extraction: Separation of translatable text from layout data.
但同等重要的是事实上它还是用于文本提取的标记。
But equally important is the fact that it is also a marker for text extraction.
二是提出了基于规则的相关文本提取算法。
Second, a rule-based algorithm for extracting relevant text is proposed.
我们提供一个文本提取工具,可以用于提取上述所有类型的文件。
We provide a text extraction tool that can extract text from all of the above file types.
本文经过大量研究,提出了一种基于学习的自然场景中文本提取算法。
After extensive research, this paper proposes a learning-based algorithm for extracting text from natural scenes.
AXIS PRO的关键功能包括数据管理、地理空间分析、链接分析、模式分析和文本提取。
Key AXIS PRO features include data management, geospatial analysis, link analysis, pattern analysis, and text extraction.
这两种方法可以说是从文本提取信息的最常用方法,但在很多情况下这两种方法还不能解决问题。
These are arguably the two most common approaches to extracting information from text, but there are many cases that these two approaches do not cover.
最后,本文解释了如何将PEAR文件导入到InfoSphereWarehouse,并在分析流中使用它从文本提取信息。
Finally, the article explains how to import the PEAR file into InfoSphere Warehouse and use it in an analytic flow to extract information from text.
文中对特殊网页的分析及其文本提取方法的研究,对网页信息挖掘技术研究和网络应用、网络监察具有重要的实际意义。
The analysis of special web pages and the study of their text extraction methods in this paper are of practical significance for research on web information mining technology, network applications, and network monitoring.
这个类用来从HTML文件中提取出文本信息。
This class is used to extract text information from HTML files.
然后,在一个挖掘流中使用这个规则文件,把概念从文本列中提取到关系数据库表中。
We will then use this rule file in a mining flow to extract the concepts from text columns in relational database tables.
在信息提取中,一个常见的任务就是从文本中提取诸如人员、产品或电子邮件地址等概念。
In information extraction, it is a common task to extract concepts such as persons, products, or email addresses from texts.
您还可以打开其他的工件,例如一个用例文件,并强调显示文本以提取一项需求,如图13所示。
You can also open other artifacts, such as a use case document, and highlight text to extract a requirement, as Figure 13 illustrates.
必须提取并翻译UI文本,例如标题、内容和错误消息。
UI text, such as headings, content, and error messages, must be extracted and translated.
其主要任务是在键生成器中从文本中提取单词。
Its main task is to extract words from the text in the key generator.
文本分析是指使计算机能够从文本中提取意义的过程。
Text analysis is the process of enabling computers to extract meaning from text.
必须找到列索引来提取该列的文本内容。
The column index must be found to extract the textual contents of the column.
查找文本是常见的问题,但是更常见的问题则是希望在找到文本之后将其提取出来。
Finding text is a common problem but, more often than not, you want to extract the text after it's found.
此时,我们只需一个脚本来清除文本、提取信息、并导出所需的XML就大功告成了。
At this point, you just need a script that cleans up the text, extracts the information, and outputs the required XML, and you are done.
文本挖掘就是用于从文本中提取信息的数据挖掘技术。
Text mining is data mining applied to information extracted from text.
文本索引通常包含关于从文本文档中提取的相关词的信息。
A text index typically consists of information about relevant terms that are extracted from the text documents.
它本身实际上是一种编程语言,可以实现复杂的逻辑语句,还可以简化部分文本的提取。
It is actually a programming language in and of itself and can be used with complex logic statements, as well as to simply pull out snippets of text.
参见pdftohtml实用工具的poppler-utils网站和包,该工具可简化从PDF文件提取文本。
See the poppler-utils website and package for the pdftohtml utility, which simplifies extracting text from PDF files.
您了解了如何设置UIMA开发环境,如何创建自己的注释器,以及在InfoSphere Warehouse中使用定制注释器从文本输入提取结构化信息。
You have learned how to set up the UIMA development environment, how to create your own annotator, and how to use it in InfoSphere Warehouse to extract structured information from text input.
Substring函数将从源字符串提取文本。
The Substring function will extract text from a source string.
下面的清单9调用CalendarCrypt.pl程序,使用“decrypt”选项,提取事件文本。
Listing 9 below calls the CalendarCrypt.pl program with the "decrypt" option to extract the event text.
JAPANESE_LEXER:一个从Japanese文本中提取标记的lexer。
JAPANESE_LEXER: a lexer for extracting tokens from Japanese text.
KOREAN_LEXER:一个从Korean文本中提取标记的lexer。
KOREAN_LEXER: a lexer for extracting tokens from Korean text.
CHINESE_VGRAM:一个从Chinese文本中提取标记的lexer。
CHINESE_VGRAM: a lexer for extracting tokens from Chinese text.