libbmc package¶

Submodules¶

libbmc.bibtex module¶

This file contains functions to read and edit BibTeX files.

TODO: Unittests

libbmc.bibtex.append(filename, data)[source]¶

Append some entries to a BibTeX file.

Parameters:
	filename – The name of the BibTeX file to edit.
	data – A `bibtexparser.BibDatabase` object.

libbmc.bibtex.bibdatabase2bibtex(data)[source]¶

Convert a BibDatabase object to a BibTeX string.

Parameters:	data – A `bibtexparser.BibDatabase` object.
Returns:	A formatted BibTeX string.

libbmc.bibtex.delete(filename, identifier)[source]¶

Delete an entry in a BibTeX file.

Parameters:
	filename – The name of the BibTeX file to edit.
	identifier – The id of the entry to delete, in the BibTeX file.

libbmc.bibtex.dict2bibtex(data)[source]¶

Convert a single BibTeX entry dict to a BibTeX string.

Parameters:	data – A dict representing a BibTeX entry, such as those found in `bibtexparser.BibDatabase.entries`.
Returns:	A formatted BibTeX string.

libbmc.bibtex.edit(filename, identifier, data)[source]¶

Update an entry in a BibTeX file.

Parameters:
	filename – The name of the BibTeX file to edit.
	identifier – The id of the entry to update, in the BibTeX file.
	data – A dict associating fields and updated values. Fields present in the BibTeX file but not in this dict will be kept as is.
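The merge behaviour described for `data` can be pictured with plain dicts. This is only a minimal sketch of the semantics, not libbmc's implementation; `merge_fields` is a hypothetical helper and the entry is represented as a bare dict of fields.

```python
def merge_fields(entry, data):
    """Return a copy of `entry` where the fields in `data` take precedence."""
    merged = dict(entry)   # keep every field already present
    merged.update(data)    # overwrite or add the updated fields
    return merged


entry = {"title": "Old title", "year": "2015"}
updated = merge_fields(entry, {"title": "New title"})
print(updated)  # the "year" field is kept, "title" is replaced
```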

libbmc.bibtex.get(filename, ignore_fields=None)[source]¶

Get all entries from a BibTeX file.

Parameters:
	filename – The name of the BibTeX file.
	ignore_fields – An optional list of fields to strip from the BibTeX file.
Returns:	A `bibtexparser.BibDatabase` object representing the fetched entries.

libbmc.bibtex.get_entry(filename, identifier, ignore_fields=None)[source]¶

Get an entry from a BibTeX file.

Parameters:
	filename – The name of the BibTeX file.
	identifier – The id of the entry to fetch, in the BibTeX file.
	ignore_fields – An optional list of fields to strip from the BibTeX file.
Returns:	A `bibtexparser.BibDatabase` object representing the fetched entry. `None` if the entry was not found.

libbmc.bibtex.get_entry_by_filter(filename, filter_function, ignore_fields=None)[source]¶

Get the first entry matching a filter function from a BibTeX file.

Note

Returns the first matching entry.

Parameters:
	filename – The name of the BibTeX file.
	filter_function – A function taking an entry and returning `True` if the entry should be selected, `False` otherwise.
	ignore_fields – An optional list of fields to strip from the BibTeX file.
Returns:	A `bibtexparser.BibDatabase` object representing the first matching entry. `None` if no entry matched.
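The "first matching entry, or `None`" behaviour can be sketched as follows. This is an illustration over a plain list of dicts, not libbmc's code; `first_match` and the sample entries (including the `ID` key) are assumptions made for the example.

```python
def first_match(entries, filter_function):
    """Return the first entry accepted by `filter_function`, or None."""
    return next((entry for entry in entries if filter_function(entry)), None)


entries = [
    {"ID": "a", "year": "2014"},
    {"ID": "b", "year": "2015"},
    {"ID": "c", "year": "2015"},
]
found = first_match(entries, lambda entry: entry["year"] == "2015")
missing = first_match(entries, lambda entry: entry["year"] == "1999")
```

Here `found` is the entry with id `b` (the later match `c` is ignored) and `missing` is `None`.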

libbmc.bibtex.replace(filename, identifier, data)[source]¶

Replace an entry in a BibTeX file.

Parameters:
	filename – The name of the BibTeX file to edit.
	identifier – The id of the entry to replace, in the BibTeX file.
	data – A `bibtexparser.BibDatabase` object containing a single entry.

libbmc.bibtex.to_filename(data, mask='{first}_{last}-{journal}-{year}{arxiv_version}', extra_formatters=None)[source]¶

Convert a BibTeX entry to a formatted filename according to a given mask.

Note

Available formatters out of the box are:

journal
title
year
first for the first author
last for the last author
authors for the list of authors
arxiv_version (discarded if no arXiv version in the BibTeX)

Filename is slugified after applying the masks.

Parameters:
	data – A `bibtexparser.BibDatabase` object representing a BibTeX entry, as returned by `bibtexparser`.
	mask – A Python format string.
	extra_formatters – A dict mapping format names (used in the mask) to lambdas that perform the formatting.
Returns:	A formatted filename.
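How a mask and extra formatters might combine can be sketched with `str.format`. This is an assumption-laden illustration, not libbmc's implementation: `apply_mask` is a hypothetical helper, only a subset of the built-in formatters is modelled, the lambdas are assumed to receive the entry's fields as a dict, and the final slugification step is omitted.

```python
def apply_mask(fields, mask, extra_formatters=None):
    """Fill `mask` using values computed from the entry's `fields`."""
    # A small subset of built-in formatters, each taking the fields dict.
    formatters = {
        "journal": lambda f: f.get("journal", ""),
        "title": lambda f: f.get("title", ""),
        "year": lambda f: f.get("year", ""),
    }
    # Extra formatters may add new names or override built-in ones.
    formatters.update(extra_formatters or {})
    values = {name: fn(fields) for name, fn in formatters.items()}
    return mask.format(**values)


fields = {"journal": "EPL", "title": "Some title", "year": "2015"}
plain = apply_mask(fields, "{journal}-{year}")
custom = apply_mask(
    fields,
    "{initial}_{year}",
    extra_formatters={"initial": lambda f: f["journal"][0]},
)
```

With these inputs `plain` is `'EPL-2015'` and `custom` is `'E_2015'`.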

libbmc.bibtex.write(filename, data)[source]¶

Create a new BibTeX file.

Parameters:
	filename – The name of the BibTeX file to write.
	data – A `bibtexparser.BibDatabase` object.

libbmc.doi module¶

This file contains all the DOI-related functions.

libbmc.doi.extract_from_text(text)[source]¶

Extract canonical DOIs from a text.

Parameters:	text – The text to extract DOIs from.
Returns:	A list of found DOIs.

>>> sorted(extract_from_text('10.1209/0295-5075/111/40005 10.1016.12.31/nature.S0735-1097(98)2000/12/31/34:7-7 10.1002/(SICI)1522-2594(199911)42:5<952::AID-MRM16>3.0.CO;2-S 10.1007/978-3-642-28108-2_19 10.1007.10/978-3-642-28108-2_19 10.1016/S0735-1097(98)00347-7 10.1579/0044-7447(2006)35\[89:RDUICP\]2.0.CO;2 <geo coords="10.4515260,51.1656910"></geo>'))
['10.1002/(SICI)1522-2594(199911)42:5<952::AID-MRM16>3.0.CO;2-S', '10.1007.10/978-3-642-28108-2_19', '10.1007/978-3-642-28108-2_19', '10.1016.12.31/nature.S0735-1097(98)2000/12/31/34:7-7', '10.1016/S0735-1097(98)00347-7', '10.1209/0295-5075/111/40005', '10.1579/0044-7447(2006)35\\[89:RDUICP\\]2.0.CO;2']

libbmc.doi.get_bibtex(doi)[source]¶

Get a BibTeX entry for a given DOI.

Note

Adapted from https://gist.github.com/jrsmith3/5513926.

Parameters:	doi – The canonical DOI to get BibTeX from.
Returns:	A BibTeX string or `None`.

>>> get_bibtex('10.1209/0295-5075/111/40005')
'@article{Verney_2015,\n\tdoi = {10.1209/0295-5075/111/40005},\n\turl = {http://dx.doi.org/10.1209/0295-5075/111/40005},\n\tyear = 2015,\n\tmonth = {aug},\n\tpublisher = {{IOP} Publishing},\n\tvolume = {111},\n\tnumber = {4},\n\tpages = {40005},\n\tauthor = {Lucas Verney and Lev Pitaevskii and Sandro Stringari},\n\ttitle = {Hybridization of first and second sound in a weakly interacting Bose gas},\n\tjournal = {{EPL}}\n}'

libbmc.doi.get_linked_version(doi)[source]¶

Get the original link behind the DOI.

Parameters:	doi – A canonical DOI.
Returns:	The canonical URL behind the DOI, or `None`.

>>> get_linked_version('10.1209/0295-5075/111/40005')
'http://stacks.iop.org/0295-5075/111/i=4/a=40005?key=crossref.9ad851948a976ecdf216d4929b0b6f01'

libbmc.doi.get_oa_policy(doi)[source]¶

Get OA policy for a given DOI.

Note

Uses the beta.dissem.in API.

Parameters:	doi – A canonical DOI.
Returns:	The open access policy for the associated publication, or `None` if unknown.

>>> tmp = get_oa_policy('10.1209/0295-5075/111/40005'); (tmp["published"], tmp["preprint"], tmp["postprint"], tmp["romeo_id"])
('can', 'can', 'can', '1896')

>>> get_oa_policy('10.1215/9780822387268') is None
True

libbmc.doi.get_oa_version(doi)[source]¶

Get an OA version for a given DOI.

Note

Uses the beta.dissem.in API.

Parameters:	doi – A canonical DOI.
Returns:	The URL of the OA version of the given DOI, or `None`.

>>> get_oa_version('10.1209/0295-5075/111/40005')
'http://arxiv.org/abs/1506.06690'

libbmc.doi.is_valid(doi)[source]¶

Check that a given DOI is a valid canonical DOI.

Parameters:	doi – The DOI to be checked.
Returns:	Boolean indicating whether the DOI is valid or not.

>>> is_valid('10.1209/0295-5075/111/40005')
True

>>> is_valid('10.1016.12.31/nature.S0735-1097(98)2000/12/31/34:7-7')
True

>>> is_valid('10.1002/(SICI)1522-2594(199911)42:5<952::AID-MRM16>3.0.CO;2-S')
True

>>> is_valid('10.1007/978-3-642-28108-2_19')
True

>>> is_valid('10.1007.10/978-3-642-28108-2_19')
True

>>> is_valid('10.1016/S0735-1097(98)00347-7')
True

>>> is_valid('10.1579/0044-7447(2006)35\[89:RDUICP\]2.0.CO;2')
True

>>> is_valid('<geo coords="10.4515260,51.1656910"></geo>')
False

libbmc.doi.to_canonical(urls)[source]¶

Convert a list of DOI URLs to a list of canonical DOIs.

Parameters:	urls – A list of DOI URLs. Can also be a single DOI URL.
Returns:	A list of canonical DOIs (or a single value if a single URL was given). `None` if an error occurred.

>>> to_canonical(['http://dx.doi.org/10.1209/0295-5075/111/40005'])
['10.1209/0295-5075/111/40005']

>>> to_canonical('http://dx.doi.org/10.1209/0295-5075/111/40005')
'10.1209/0295-5075/111/40005'

>>> to_canonical('aaaa') is None
True

>>> to_canonical(['aaaa']) is None
True

libbmc.doi.to_url(dois)[source]¶

Convert a list of canonical DOIs to a list of DOI URLs.

Parameters:	dois – A list of canonical DOIs. Can also be a single canonical DOI.
Returns:	A list of DOI URLs (or a single value if a single DOI was given).

>>> to_url(['10.1209/0295-5075/111/40005'])
['http://dx.doi.org/10.1209/0295-5075/111/40005']

>>> to_url('10.1209/0295-5075/111/40005')
'http://dx.doi.org/10.1209/0295-5075/111/40005'
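The list-or-single-value convention shown in the examples above can be sketched as follows. This is a minimal illustration, not libbmc's code; `doi_to_url` is a hypothetical name, and the URL prefix is the one appearing in the doctest output.

```python
DOI_URL_PREFIX = "http://dx.doi.org/"


def doi_to_url(dois):
    """Prefix one DOI, or each DOI of a list, with the resolver URL."""
    if isinstance(dois, list):
        return [DOI_URL_PREFIX + doi for doi in dois]
    return DOI_URL_PREFIX + dois


single = doi_to_url("10.1209/0295-5075/111/40005")
several = doi_to_url(["10.1209/0295-5075/111/40005"])
```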

libbmc.fetcher module¶

This file contains functions to download papers locally, optionally through a proxy.

libbmc.fetcher.download(url, proxies=None)[source]¶

Download a PDF or DjVu document from a URL, optionally through proxies.

Parameters:
	url – The URL of the PDF/DjVu document to fetch.
	proxies – An optional list of proxy strings to use. Proxies are tried sequentially. Include `""` (empty string) in the list if you want to try fetching directly, without any proxy.
Returns:	A tuple of the raw content of the downloaded data and its associated content-type. Returns `(None, None)` if the document could not be downloaded.

>>> download("http://arxiv.org/pdf/1312.4006.pdf") 
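The sequential proxy fallback described above can be sketched without any network access by injecting the function that performs the request. Everything here is hypothetical: `fetch_with_proxies`, the `fetch` callback and its `(url, proxy)` signature are assumptions for the example, as is falling back to a direct fetch when no proxy list is given.

```python
def fetch_with_proxies(url, proxies, fetch):
    """Try each proxy in order and return the first successful result."""
    for proxy in proxies or [""]:  # assumption: no list means a direct fetch
        try:
            return fetch(url, proxy)
        except OSError:
            continue  # this proxy failed, try the next one
    return (None, None)


def fake_fetch(url, proxy):
    """Stand-in for a real HTTP call: only the direct connection works."""
    if proxy == "":
        return (b"%PDF-1.4", "application/pdf")
    raise ConnectionError("proxy unreachable: " + proxy)


content, content_type = fetch_with_proxies(
    "http://example.com/paper.pdf", ["socks5://localhost:1", ""], fake_fetch
)
failed = fetch_with_proxies(
    "http://example.com/paper.pdf", ["socks5://localhost:1"], fake_fetch
)
```

In this run the first proxy fails, the direct fetch succeeds, and `failed` is `(None, None)` because the empty string was left out of the second list.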

libbmc.isbn module¶

This file contains all the ISBN-related functions.

libbmc.isbn.extract_from_text(text)[source]¶

Extract ISBNs from a text.

Parameters:	text – Some text.
Returns:	A list of canonical ISBNs found in the text.

>>> extract_from_text("978-3-16-148410-0 9783161484100 9783161484100aa abcd 0136091814 0136091812 9780136091817 123456789X")
['9783161484100', '9783161484100', '9783161484100', '0136091814', '123456789X']

libbmc.isbn.from_doi(doi_identifier)[source]¶

Make an ISBN out of the given DOI.

Note

Taken from https://github.com/xlcnd/isbnlib/issues/30#issuecomment-167444777.

Note

See https://github.com/xlcnd/isbnlib#note. The returned ISBN may not be issued yet (it is a valid one, but does not necessarily correspond to an existing book).

Parameters:	doi_identifier – A valid canonical DOI.
Returns:	An ISBN string.

>>> from_doi('10.978.316/1484100')
'9783161484100'

libbmc.isbn.get_bibtex(isbn_identifier)[source]¶

Get a BibTeX string for the given ISBN.

Parameters:	isbn_identifier – ISBN to fetch BibTeX entry for.
Returns:	A BibTeX string, or `None` if it could not be fetched.

>>> get_bibtex('9783161484100')
'@book{9783161484100,\n     title = {Berkeley, Oakland: Albany, Emeryville, Alameda, Kensington},\n    author = {Peekaboo Maps},\n      isbn = {9783161484100},\n      year = {2009},\n publisher = {Peek A Boo Maps}\n}'

libbmc.isbn.is_valid(isbn_id)[source]¶

Check that a given string is a valid ISBN.

Parameters:	isbn_id – The ISBN to be checked.
Returns:	Boolean indicating whether the ISBN is valid or not.

>>> is_valid("978-3-16-148410-0")
True

>>> is_valid("9783161484100")
True

>>> is_valid("9783161484100aa")
False

>>> is_valid("abcd")
False

>>> is_valid("0136091814")
True

>>> is_valid("0136091812")
False

>>> is_valid("9780136091817")
False

>>> is_valid("123456789X")
True
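The results above are consistent with the standard ISBN check-digit rules, which can be sketched as follows. This is not libbmc's implementation (which may rely on an external library and apply further checks such as the ISBN-13 prefix); `isbn_checksum_ok` is a hypothetical helper that verifies the check digit only.

```python
ASCII_DIGITS = set("0123456789")


def isbn_checksum_ok(isbn):
    """Check the ISBN-10 or ISBN-13 check digit, ignoring hyphens and spaces."""
    chars = isbn.replace("-", "").replace(" ", "").upper()
    if len(chars) == 10:
        body, check = chars[:9], chars[9]
        if not set(body) <= ASCII_DIGITS or check not in ASCII_DIGITS | {"X"}:
            return False
        values = [int(c) for c in body] + [10 if check == "X" else int(check)]
        # Weights run from 10 down to 1; the weighted sum must be divisible by 11.
        return sum((10 - i) * v for i, v in enumerate(values)) % 11 == 0
    if len(chars) == 13 and set(chars) <= ASCII_DIGITS:
        # Weights alternate 1, 3, 1, 3, ...; the sum must be divisible by 10.
        total = sum(int(c) * (1 if i % 2 == 0 else 3) for i, c in enumerate(chars))
        return total % 10 == 0
    return False


results = [isbn_checksum_ok(s) for s in ("0136091814", "0136091812", "123456789X")]
```

For instance, `0136091814` gives a weighted sum of 154, which is divisible by 11, whereas `0136091812` gives 152 and is rejected.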

libbmc.isbn.to_doi(isbn_identifier)[source]¶

Make a DOI out of the given ISBN.

Note

See https://github.com/xlcnd/isbnlib#note. The returned DOI may not be issued yet.

Parameters:	isbn_identifier – A valid ISBN string.
Returns:	A DOI as string.

>>> to_doi('9783161484100')
'10.978.316/1484100'

libbmc.tools module¶

This file contains various utility functions.

libbmc.tools.batch(iterable, size)[source]¶

Get items from an iterable one batch at a time.

Parameters:
	iterable – An iterable to get batches from.
	size – Size of the batches.
Returns:	An iterator yielding batches of at most `size` items; the last batch may be shorter.

>>> [list(i) for i in batch([1, 2, 3, 4, 5], 2)]
[[1, 2], [3, 4], [5]]
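One common way to obtain this behaviour is a generator built on `itertools.islice`. The sketch below is an illustration rather than libbmc's code; `batch_sketch` is a hypothetical name, and it yields lists where the real function may yield other iterables.

```python
from itertools import islice


def batch_sketch(iterable, size):
    """Yield successive lists of at most `size` items from `iterable`."""
    iterator = iter(iterable)
    while True:
        chunk = list(islice(iterator, size))
        if not chunk:  # the iterator is exhausted
            return
        yield chunk


batches = list(batch_sketch([1, 2, 3, 4, 5], 2))
```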

libbmc.tools.clean_whitespaces(text)[source]¶

Collapse runs of whitespace in a text into single spaces. Also removes leading and trailing whitespace.

Parameters:	text – Text to clean.
Returns:	The cleaned text.

>>> clean_whitespaces("this  is    a text with    spaces")
'this is a text with spaces'
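A compact way to get the same effect in plain Python is to split on whitespace and rejoin with single spaces. This is a sketch of the idea, not necessarily how libbmc implements it; `collapse_whitespace` is a hypothetical name.

```python
def collapse_whitespace(text):
    """Collapse whitespace runs into single spaces and strip both ends."""
    # str.split() without arguments splits on any whitespace run
    # and discards leading and trailing whitespace.
    return " ".join(text.split())


cleaned = collapse_whitespace("  this  is    a text with    spaces ")
```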

libbmc.tools.map_or_apply(function, param)[source]¶

Map the function over param, or apply it directly, depending on whether param is a list or a single item.

Parameters:
	function – The function to apply.
	param – The parameter to feed the function with (list or item).
Returns:	The computed value or `None`.
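The dispatch on the parameter type can be sketched as follows. This is a simplified illustration with a hypothetical name, `map_or_apply_sketch`; it does not model the cases in which the real function returns `None`.

```python
def map_or_apply_sketch(function, param):
    """Apply `function` to each item of a list, or to a single item."""
    if isinstance(param, list):
        return [function(item) for item in param]
    return function(param)


one = map_or_apply_sketch(str.upper, "doi")
many = map_or_apply_sketch(str.upper, ["doi", "isbn"])
```

Here `one` is `'DOI'` and `many` is `['DOI', 'ISBN']`, mirroring the list-or-single convention used by functions such as `libbmc.doi.to_url`.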

libbmc.tools.remove_duplicates(some_list)[source]¶

Remove the duplicates from a list.

Parameters:	some_list – List to remove duplicates from.
Returns:	A list without duplicates.

>>> remove_duplicates([1, 2, 3, 1])
[1, 2, 3]

>>> remove_duplicates([1, 2, 1, 2])
[1, 2]
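Both examples keep the first occurrence of each item and preserve order. For hashable items, that behaviour can be sketched with `dict.fromkeys`, which keeps insertion order in Python 3.7 and later. `dedupe` is a hypothetical name, and libbmc's own implementation may differ.

```python
def dedupe(some_list):
    """Remove duplicates, keeping the first occurrence of each item."""
    return list(dict.fromkeys(some_list))


first = dedupe([1, 2, 3, 1])
second = dedupe([1, 2, 1, 2])
```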

libbmc.tools.remove_urls(text)[source]¶

Remove URLs from a given text (only removes http, https and naked domains URLs).

Parameters:	text – The text to remove URLs from.
Returns:	The text without URLs.

>>> remove_urls("foobar http://example.com https://example.com foobar")
'foobar foobar'

libbmc.tools.replace_all(text, replace_dict)[source]¶

Replace multiple strings in a text.

Note

Replacements are made successively, with no guarantee on the order in which they are made.

Parameters:
	text – Text to replace in.
	replace_dict – Dictionary mapping the strings to replace to their substitutions.
Returns:	Text after replacements.

>>> replace_all("foo bar foo thing", {"foo": "oof", "bar": "rab"})
'oof rab oof thing'
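Why the order of successive replacements matters can be seen with a small sketch. `replace_all_sketch` is a hypothetical stand-in that iterates over the dict in insertion order; it is not libbmc's code.

```python
def replace_all_sketch(text, replace_dict):
    """Apply each replacement in turn to the running result."""
    for old, new in replace_dict.items():
        text = text.replace(old, new)
    return text


independent = replace_all_sketch("foo bar foo thing", {"foo": "oof", "bar": "rab"})
chained = replace_all_sketch("ab", {"a": "b", "b": "c"})
```

The first call gives `'oof rab oof thing'` whatever the order. The second gives `'cc'` rather than `'bc'`, because the `b` produced by the first replacement is then rewritten by the second. Avoid dictionaries in which one substitution produces text that another key matches.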

libbmc.tools.slugify(value)[source]¶

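Normalizes a string to produce a clean filename: transliterates accented characters to ASCII where possible, removes the remaining non-alphanumeric characters, and converts spaces to underscores. Case is preserved, as in the example below.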

From Django’s “django/template/defaultfilters.py”.

>>> slugify("El pingüino Wenceslao hizo kilómetros bajo exhaustiva lluvia y frío, añoraba a su querido cachorro. ortez ce vieux whisky au juge blond qui fume sur son île intérieure, à Γαζέες καὶ μυρτιὲς δὲν θὰ βρῶ πιὰ στὸ χρυσαφὶ ξέφωτο いろはにほへとちりぬるを Pchnąć w tę łódź jeża lub ośm skrzyń fig กว่าบรรดาฝูงสัตว์เดรัจฉาน")
'El_pinguino_Wenceslao_hizo_kilometros_bajo_exhaustiva_lluvia_y_frio_anoraba_a_su_querido_cachorro_ortez_ce_vieux_whisky_au_juge_blond_qui_fume_sur_son_ile_interieure_a_Pchnac_w_te_odz_jeza_lub_osm_skrzyn_fig'
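The output above is consistent with Unicode NFKD normalization followed by dropping whatever cannot be encoded as ASCII. The following sketch reproduces that idea; `slugify_sketch` and both regular expressions are assumptions for the example, not the library's actual code.

```python
import re
import unicodedata

STRIP_RE = re.compile(r"[^\w\s-]")   # anything but word chars, whitespace, hyphens
SEPARATOR_RE = re.compile(r"[-\s]+")  # runs of hyphens and whitespace


def slugify_sketch(value):
    """Transliterate to ASCII, drop punctuation, join words with underscores."""
    # NFKD splits accented letters into a base letter plus combining marks;
    # encoding to ASCII with errors ignored then drops the marks and any
    # character that has no ASCII base letter.
    ascii_value = (
        unicodedata.normalize("NFKD", value).encode("ascii", "ignore").decode("ascii")
    )
    stripped = STRIP_RE.sub("", ascii_value).strip()
    return SEPARATOR_RE.sub("_", stripped)


slug = slugify_sketch("El pingüino, frío")
```

Here `slug` is `'El_pinguino_frio'`: the accents are folded away, the comma is removed, and case is left untouched.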

Module contents¶

libbmc

libbmc is a generic Python library for managing bibliographies and working with scientific papers.
