tika

The Apache Tika™ toolkit detects and extracts metadata and text from over a thousand file types, such as PPT, XLS, and PDF. Every type is parsed through a single interface, which makes Tika useful for search engine indexing, content analysis, translation, and similar tasks.
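
As a rough illustration of the single-interface idea, the sketch below sends a document to a running tika-server over HTTP and reads back either plain text or metadata. It assumes a server is already listening on localhost port 9998 (the usual default) and that the `/tika` and `/meta` endpoints accept a PUT with the raw file bytes. The helper names `build_request`, `extract_text`, and `extract_metadata` are hypothetical and not part of Tika itself.

```python
import json
import urllib.request

# Assumption: a tika-server instance is running locally on port 9998.
TIKA_URL = "http://localhost:9998"


def build_request(data: bytes, endpoint: str = "tika",
                  accept: str = "text/plain",
                  base_url: str = TIKA_URL) -> urllib.request.Request:
    """Build a PUT request carrying raw document bytes to a server endpoint."""
    return urllib.request.Request(
        url=f"{base_url}/{endpoint}",
        data=data,
        method="PUT",
        headers={
            # Ask for the response format we want back.
            "Accept": accept,
            # Send a generic type so the server detects the real one itself.
            "Content-Type": "application/octet-stream",
        },
    )


def extract_text(path: str) -> str:
    """Return the plain text the server extracts from the file at `path`."""
    with open(path, "rb") as f:
        req = build_request(f.read(), endpoint="tika", accept="text/plain")
    with urllib.request.urlopen(req) as resp:
        return resp.read().decode("utf-8")


def extract_metadata(path: str) -> dict:
    """Return the metadata the server reports for the file at `path`."""
    with open(path, "rb") as f:
        req = build_request(f.read(), endpoint="meta",
                            accept="application/json")
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

With a server running, `extract_text("report.pdf")` and `extract_text("slides.ppt")` would go through the same call path regardless of file type; the file names here are placeholders.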

Name
tika
Main Program
tika-server
Programs
  • tika-app
  • tika-server
Homepage
Version
2.9.3
License
Maintainers
Platforms
  • i686-linux
  • x86_64-linux
  • aarch64-linux
  • armv7l-linux
  • armv6l-linux
  • powerpc64le-linux
  • riscv64-linux
  • aarch64-darwin
  • x86_64-darwin
Defined
Source