python-pdfminer - PDF parser and analyser

Property              Value
Distribution          Ubuntu 16.04 LTS (Xenial Xerus)
Repository            Ubuntu Universe amd64
Package name          python-pdfminer
Package version       20140328+dfsg
Package release       1
Package architecture  all
Package type          deb
Installed size        546 B
Download size         98.79 KB
PDFMiner is a tool for extracting information from PDF documents. It focuses
entirely on extracting and analyzing text data. It can report the exact
location of text on a page, along with other information such as fonts or
lines. It includes a PDF converter that can transform PDF files into other
text formats (such as HTML), and an extensible PDF parser that can be used
for purposes other than text analysis.
This package provides the Python module and the command-line tools: pdf2txt
and dumppdf.
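
As an illustration of the converter described above, the following minimal sketch builds and runs a pdf2txt command line from Python. It assumes the pdf2txt tool from this package is on the PATH; the helper names and the file names in the usage note are hypothetical. The -t option selects the output type and -o names the output file.

```python
import subprocess

# Output types accepted by pdf2txt's -t option.
OUTPUT_TYPES = ("text", "html", "xml", "tag")


def build_pdf2txt_command(pdf_path, out_path, out_format="html"):
    """Build the argument list for a pdf2txt conversion (hypothetical helper)."""
    if out_format not in OUTPUT_TYPES:
        raise ValueError("unsupported output type: %r" % (out_format,))
    return ["pdf2txt", "-t", out_format, "-o", out_path, pdf_path]


def convert(pdf_path, out_path, out_format="html"):
    """Run pdf2txt; raises CalledProcessError on a non-zero exit status."""
    subprocess.check_call(build_pdf2txt_command(pdf_path, out_path, out_format))
```

Calling convert("report.pdf", "report.html") would then write an HTML rendering of report.pdf, provided that file exists.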


Package Version Architecture Repository
python-pdfminer_20140328+dfsg-1_all.deb 20140328+dfsg all Ubuntu Universe
python-pdfminer - - -
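
The file name in the table above encodes the package name, version, release, and architecture. The sketch below splits such a name into those fields. It assumes the usual name_version_arch.deb layout with the Debian revision after the last hyphen of the version; the function name is hypothetical, and epochs are not handled.

```python
def parse_deb_filename(filename):
    """Split 'name_version_arch.deb' into its fields (illustrative helper)."""
    suffix = ".deb"
    if not filename.endswith(suffix):
        raise ValueError("not a .deb file name: %r" % (filename,))
    # Unpacking raises ValueError if there are not exactly three fields.
    name, version, arch = filename[:-len(suffix)].split("_")
    # The release (Debian revision) follows the last hyphen, if there is one.
    upstream, sep, release = version.rpartition("-")
    if not sep:
        upstream, release = version, ""
    return {
        "name": name,
        "version": upstream,
        "release": release,
        "architecture": arch,
    }
```

For python-pdfminer_20140328+dfsg-1_all.deb this yields the same values as the property table: name python-pdfminer, version 20140328+dfsg, release 1, architecture all.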


Name Value
python:any >= 2.7.5-5~
python:any << 2.8
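
Taken together, the two constraints above restrict the package to the Python 2.7 series. The following is a simplified sketch of that range check, not the real dpkg algorithm: it compares only dotted numeric components and ignores the Debian revision and the ~ suffix. The function name is hypothetical.

```python
def python_version_in_range(version):
    """Return True if a dotted version such as '2.7.12' is >= 2.7.5 and < 2.8.

    Simplification: Debian revisions and '~' ordering are not modelled.
    """
    parts = tuple(int(piece) for piece in version.split("."))
    return (2, 7, 5) <= parts < (2, 8)
```

For an authoritative comparison of full Debian version strings, dpkg --compare-versions is the tool to use.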


Type URL
Binary Package python-pdfminer_20140328+dfsg-1_all.deb
Source Package pdfminer

Install Howto

  1. Update the package index:
    $ sudo apt-get update
  2. Install the python-pdfminer deb package:
    $ sudo apt-get install python-pdfminer
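
After the steps above, one quick way to confirm the installation is to try importing the module. This is a minimal sketch with a hypothetical helper name; it should be run with the Python 2.7 interpreter, since that is the interpreter the package depends on.

```python
def module_available(name):
    """Return True if the named module can be imported by this interpreter."""
    try:
        __import__(name)
    except ImportError:
        return False
    return True


if __name__ == "__main__":
    # 'pdfminer' is the top-level module provided by python-pdfminer.
    if module_available("pdfminer"):
        print("pdfminer is importable")
    else:
        print("pdfminer is not importable")
```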




2015-11-05 - Daniele Tricoli
pdfminer (20140328+dfsg-1) unstable; urgency=medium
  [ Jakub Wilk ]
  * Use canonical URIs for Vcs-* fields.
  [ Daniele Tricoli ]
  * New upstream release. (Closes: #741046, #794682)
  * debian/compat
    - Bump debhelper compatibility level to 9.
  * debian/control
    - Bump debhelper B-D to (>= 9).
    - Bump Standards-Version to 3.9.6 (no changes needed).
    - Add dh-python to B-D.
    - Bump X-Python-Version to >= 2.6.
    - Drop elinks-lite from B-D since it is a transitional package.
    - Update Vcs fields for git migration.
  * debian/copyright
    - Update Format URI.
    - Update copyright years.
    - Use Files-Excluded to remove non redistributable files and prebuilt
      Python objects.
    - Rename cmapsrc into cmaprsrc.
  * debian/
    - Remove because superseded by Files-Excluded in debian/copyright.
  * debian/manpages/latin2ascii.1.xml
    - Add manpage for latin2ascii.
  * debian/patches/pickle-protocol2.diff
    - Refresh.
  * debian/patches/avoid-timestamped-gzip.patch
    - Avoid timestamps in gzip-compressed file and use compressionlevel=9 to
      reduce data size.
  * debian/
    - Remove README.txt since not shipped anymore.
  * debian/python-pdfminer.install
    - Add /usr/bin/latin2ascii.
  * debian/{python-pdfminer.install,rules}
    - Don't recreate cmap/ since pdfminer.cmap is not a
      Python package anymore.
  * debian/rules
    - Use uscan inside get-orig-source target.
    - Disable tests inside samples directory since they are failing also
  * debian/watch
    - Use redirector.

2011-08-24 - Daniele Tricoli
pdfminer (20110515+dfsg-1) unstable; urgency=low
  * New upstream release
  * Upload to unstable
  * debian/control
    - Removed Jakub and added Debian Python Modules Team to Maintainer
    - Added myself to Uploaders (Closes: #629178)
    - Bumped Standards-Version to 3.9.2 (no changes needed)
  * debian/{control,rules}
    - Switched to dh_python2

2011-03-05 - Jakub Wilk
pdfminer (20110227+dfsg-1) experimental; urgency=low
  * New upstream release.
    + Document the -V option in pdf2txt manual page.
  * Correct a few grammatical errors in the manual pages and in the package
    description. Thanks to Stefano Rivera for help.
  * Remove byte-compiled files from (repackaged) upstream tarball.
  * Use $() constructs rather than backticks in shell scripts.
  * Rename some private variables in debian/rules to make them lowercase.

2010-12-28 - Jakub Wilk
pdfminer (20101226+dfsg-1) experimental; urgency=low
  * New upstream release.
    + Drop fix-test-psparser.diff, applied upstream.
    + Prevent upstream Makefile from using ‘python2’ binary. [python2.diff]

2010-12-02 - Jakub Wilk
pdfminer (20101017+dfsg-1) experimental; urgency=low
  * New upstream release.
  * Fix a typo in the pdf2txt manual page.
  * Backport an upstream patch to fix test failures. [fix-test-psparser.diff]
  * To fix FTBFS when built twice in a row:
    + force dh_auto_clean to use distutils build system;
    + add samples/{*.txt,*.html,*.xml} to debian/clean.

2010-08-29 - Jakub Wilk
pdfminer (20100829+dfsg-1) experimental; urgency=low
  * New upstream release.
  * Add mutual Breaks to ensure that if python-pdfminer and pdfminer-data are
    installed together, they have the same version.
  * Use pickle protocol 2 for serializing data. [pickle-protocol-2.diff]

See Also

Package Description
python-pdfrw-doc_0.2-2_all.deb PDF file manipulation library (documentation)
python-pdfrw_0.2-2_all.deb PDF file manipulation library (Python 2)
python-pdftools_0.37-4_all.deb PDF document reading classes
python-peak.rules_0.5a1+r2713-1_all.deb generic functions support for Python
python-peak.util.decorators_1.8-3_all.deb version-agnostic decorators support for Python
python-peak.util_20160204-1_all.deb utilities from the Python Enterprise Application Kit
python-pebl-doc_1.0.2-3_all.deb Python Environment for Bayesian Learning - documentation
python-pebl_1.0.2-3_amd64.deb Python Environment for Bayesian Learning
python-pefile_1.2.10.139-2_all.deb Portable Executable (PE) parsing module for Python
python-pelican_3.6.3-1_all.deb transitional dummy package
python-pep8-naming_0.3.3-1_all.deb check for PEP 8 naming conventions (flake8 plugin for Python2)
python-pep8_1.7.0-2_all.deb Python PEP 8 code style checker - python
python-persistent_4.1.1-1build2_amd64.deb Automatic persistence for Python objects
python-petname_1.12-0ubuntu1_all.deb python library for generating pronounceable, memorable, pet names
python-pex-cli_1.1.4-1_all.deb transitional dummy package for pex