libbmc package
Submodules
libbmc.bibtex module
This file contains functions to read and edit BibTeX files.
TODO: Unittests
libbmc.bibtex.append(filename, data)
Append some entries to a BibTeX file.
Parameters:
- filename – The name of the BibTeX file to edit.
- data – A bibtexparser.BibDatabase object.
libbmc.bibtex.bibdatabase2bibtex(data)
Convert a BibDatabase object to a BibTeX string.
Parameters:
- data – A bibtexparser.BibDatabase object.
Returns: A formatted BibTeX string.
libbmc.bibtex.delete(filename, identifier)
Delete an entry in a BibTeX file.
Parameters:
- filename – The name of the BibTeX file to edit.
- identifier – The id of the entry to delete, in the BibTeX file.
libbmc.bibtex.dict2bibtex(data)
Convert a single BibTeX entry dict to a BibTeX string.
Parameters:
- data – A dict representing a BibTeX entry, like those found in bibtexparser.BibDatabase.entries.
Returns: A formatted BibTeX string.
libbmc.bibtex.edit(filename, identifier, data)
Update an entry in a BibTeX file.
Parameters:
- filename – The name of the BibTeX file to edit.
- identifier – The id of the entry to update, in the BibTeX file.
- data – A dict associating fields and updated values. Fields present in the BibTeX file but not in this dict will be kept as is.
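The update rule described for edit (fields in data win, all other fields are kept) can be illustrated on plain dicts. This is a minimal sketch, not libbmc's implementation: merge_fields is a hypothetical helper, and the entry is a hand-written dict rather than one parsed by bibtexparser.

```python
def merge_fields(entry, data):
    """Return a copy of `entry` with the fields in `data` overriding it.

    Hypothetical helper: fields absent from `data` are kept unchanged.
    """
    merged = dict(entry)  # copy, so the original entry is not mutated
    merged.update(data)   # updated values win over existing ones
    return merged


# Hand-written entry, for illustration only.
entry = {"ID": "Verney_2015", "year": "2015", "journal": "EPL"}
updated = merge_fields(entry, {"year": "2016", "note": "erratum"})
print(updated)
```

Here journal and ID survive untouched, year is overwritten, and note is added.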
libbmc.bibtex.get(filename, ignore_fields=None)
Get all entries from a BibTeX file.
Parameters:
- filename – The name of the BibTeX file.
- ignore_fields – An optional list of fields to strip from the BibTeX file.
Returns: A bibtexparser.BibDatabase object representing the fetched entries.
libbmc.bibtex.get_entry(filename, identifier, ignore_fields=None)
Get an entry from a BibTeX file.
Parameters:
- filename – The name of the BibTeX file.
- identifier – The id of the entry to fetch, in the BibTeX file.
- ignore_fields – An optional list of fields to strip from the BibTeX file.
Returns: A bibtexparser.BibDatabase object representing the fetched entry, or None if the entry was not found.
libbmc.bibtex.get_entry_by_filter(filename, filter_function, ignore_fields=None)
Get an entry from a BibTeX file.
Note: Returns the first matching entry.
Parameters:
- filename – The name of the BibTeX file.
- filter_function – A function returning True or False, depending on whether the entry should be included.
- ignore_fields – An optional list of fields to strip from the BibTeX file.
Returns: A bibtexparser.BibDatabase object representing the first matching entry, or None if no entry was found.
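The "first matching entry, or None" behaviour can be sketched with a generator and next(). This is an illustration on plain dicts under stated assumptions: first_match is a hypothetical helper, and the filter is assumed to receive one entry dict at a time, which the documentation above implies but does not spell out.

```python
def first_match(entries, filter_function):
    """Return the first entry accepted by `filter_function`, or None."""
    return next((e for e in entries if filter_function(e)), None)


# Hand-written entries, for illustration only.
entries = [
    {"ID": "a", "year": "2014"},
    {"ID": "b", "year": "2015"},
    {"ID": "c", "year": "2015"},
]
match = first_match(entries, lambda e: e["year"] == "2015")
missing = first_match(entries, lambda e: e["year"] == "1999")
print(match, missing)
```

Entry "c" also satisfies the first filter, but only "b" is returned because the search stops at the first hit.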
libbmc.bibtex.replace(filename, identifier, data)
Replace an entry in a BibTeX file.
Parameters:
- filename – The name of the BibTeX file to edit.
- identifier – The id of the entry to replace, in the BibTeX file.
- data – A bibtexparser.BibDatabase object containing a single entry.
libbmc.bibtex.to_filename(data, mask='{first}_{last}-{journal}-{year}{arxiv_version}', extra_formatters=None)
Convert a BibTeX entry to a formatted filename according to a given mask.
Note: The formatters available out of the box are:
- journal
- title
- year
- first, for the first author
- last, for the last author
- authors, for the list of authors
- arxiv_version (discarded if there is no arXiv version in the BibTeX entry)
The filename is slugified after applying the mask.
Parameters:
- data – A bibtexparser.BibDatabase object representing a BibTeX entry, as returned by bibtexparser.
- mask – A Python format string.
- extra_formatters – A dict mapping format fields (used in the mask) to the lambdas that perform the formatting.
Returns: A formatted filename.
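To make the mask mechanism concrete, here is a minimal sketch of mask-based formatting on a plain entry dict. It is not libbmc's code: format_filename is a hypothetical function, authors are assumed to be joined by " and " as in BibTeX, first and last are taken as full author names, arxiv_version is simply left empty, each extra formatter is assumed to be a callable taking the entry, and the final regular expression is only a crude stand-in for slugification.

```python
import re


def format_filename(entry, mask, extra_formatters=None):
    """Hypothetical sketch: fill `mask` from a plain entry dict."""
    authors = [a.strip() for a in entry.get("author", "").split(" and ")]
    values = {
        "journal": entry.get("journal", ""),
        "title": entry.get("title", ""),
        "year": entry.get("year", ""),
        "first": authors[0],
        "last": authors[-1],
        "authors": ", ".join(authors),
        "arxiv_version": "",  # assumed empty when no arXiv version is known
    }
    # Assumption: each extra formatter is a callable taking the entry dict.
    for name, formatter in (extra_formatters or {}).items():
        values[name] = formatter(entry)
    filename = mask.format(**values)
    # Crude stand-in for slugification: keep word characters and hyphens.
    return re.sub(r"[^\w-]+", "_", filename).strip("_")


# Hand-written entry, for illustration only.
entry = {
    "ID": "Verney_2015",
    "author": "Lucas Verney and Lev Pitaevskii and Sandro Stringari",
    "journal": "EPL",
    "year": "2015",
}
default_name = format_filename(entry, "{first}_{last}-{journal}-{year}{arxiv_version}")
custom_name = format_filename(entry, "{year}-{key}", {"key": lambda e: e["ID"]})
print(default_name, custom_name)
```

The second call shows how an extra formatter adds a new field, here key, that the mask can then reference.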
libbmc.doi module
This file contains all the DOI-related functions.
libbmc.doi.extract_from_text(text)
Extract canonical DOIs from a text.
Parameters:
- text – The text to extract DOIs from.
Returns: A list of found DOIs.
>>> sorted(extract_from_text('10.1209/0295-5075/111/40005 10.1016.12.31/nature.S0735-1097(98)2000/12/31/34:7-7 10.1002/(SICI)1522-2594(199911)42:5<952::AID-MRM16>3.0.CO;2-S 10.1007/978-3-642-28108-2_19 10.1007.10/978-3-642-28108-2_19 10.1016/S0735-1097(98)00347-7 10.1579/0044-7447(2006)35\[89:RDUICP\]2.0.CO;2 <geo coords="10.4515260,51.1656910"></geo>'))
['10.1002/(SICI)1522-2594(199911)42:5<952::AID-MRM16>3.0.CO;2-S', '10.1007.10/978-3-642-28108-2_19', '10.1007/978-3-642-28108-2_19', '10.1016.12.31/nature.S0735-1097(98)2000/12/31/34:7-7', '10.1016/S0735-1097(98)00347-7', '10.1209/0295-5075/111/40005', '10.1579/0044-7447(2006)35\\[89:RDUICP\\]2.0.CO;2']
libbmc.doi.get_bibtex(doi)
Get a BibTeX entry for a given DOI.
Note: Adapted from https://gist.github.com/jrsmith3/5513926.
Parameters:
- doi – The canonical DOI to get BibTeX from.
Returns: A BibTeX string or None.
>>> get_bibtex('10.1209/0295-5075/111/40005')
'@article{Verney_2015,\n\tdoi = {10.1209/0295-5075/111/40005},\n\turl = {http://dx.doi.org/10.1209/0295-5075/111/40005},\n\tyear = 2015,\n\tmonth = {aug},\n\tpublisher = {{IOP} Publishing},\n\tvolume = {111},\n\tnumber = {4},\n\tpages = {40005},\n\tauthor = {Lucas Verney and Lev Pitaevskii and Sandro Stringari},\n\ttitle = {Hybridization of first and second sound in a weakly interacting Bose gas},\n\tjournal = {{EPL}}\n}'
libbmc.doi.get_linked_version(doi)
Get the original link behind the DOI.
Parameters:
- doi – A canonical DOI.
Returns: The canonical URL behind the DOI, or None.
>>> get_linked_version('10.1209/0295-5075/111/40005')
'http://stacks.iop.org/0295-5075/111/i=4/a=40005?key=crossref.9ad851948a976ecdf216d4929b0b6f01'
libbmc.doi.get_oa_policy(doi)
Get the open access (OA) policy for a given DOI.
Note: Uses the beta.dissem.in API.
Parameters:
- doi – A canonical DOI.
Returns: The open access policy for the associated publication, or None if unknown.
>>> tmp = get_oa_policy('10.1209/0295-5075/111/40005'); (tmp["published"], tmp["preprint"], tmp["postprint"], tmp["romeo_id"])
('can', 'can', 'can', '1896')
>>> get_oa_policy('10.1215/9780822387268') is None
True
libbmc.doi.get_oa_version(doi)
Get an OA version for a given DOI.
Note: Uses the beta.dissem.in API.
Parameters:
- doi – A canonical DOI.
Returns: The URL of the OA version of the given DOI, or None.
>>> get_oa_version('10.1209/0295-5075/111/40005')
'http://arxiv.org/abs/1506.06690'
libbmc.doi.is_valid(doi)
Check that a given DOI is a valid canonical DOI.
Parameters:
- doi – The DOI to be checked.
Returns: Boolean indicating whether the DOI is valid or not.
>>> is_valid('10.1209/0295-5075/111/40005')
True
>>> is_valid('10.1016.12.31/nature.S0735-1097(98)2000/12/31/34:7-7')
True
>>> is_valid('10.1002/(SICI)1522-2594(199911)42:5<952::AID-MRM16>3.0.CO;2-S')
True
>>> is_valid('10.1007/978-3-642-28108-2_19')
True
>>> is_valid('10.1007.10/978-3-642-28108-2_19')
True
>>> is_valid('10.1016/S0735-1097(98)00347-7')
True
>>> is_valid('10.1579/0044-7447(2006)35\[89:RDUICP\]2.0.CO;2')
True
>>> is_valid('<geo coords="10.4515260,51.1656910"></geo>')
False
libbmc.doi.to_canonical(urls)
Convert a list of DOI URLs to a list of canonical DOIs.
Parameters:
- urls – A list of DOI URLs. Can also be a single DOI URL.
Returns: A list of canonical DOIs (or a single value if a single URL was given). None if an error occurred.
>>> to_canonical(['http://dx.doi.org/10.1209/0295-5075/111/40005'])
['10.1209/0295-5075/111/40005']
>>> to_canonical('http://dx.doi.org/10.1209/0295-5075/111/40005')
'10.1209/0295-5075/111/40005'
>>> to_canonical('aaaa') is None
True
>>> to_canonical(['aaaa']) is None
True
libbmc.doi.to_url(dois)
Convert a list of canonical DOIs to a list of DOI URLs.
Parameters:
- dois – A list of canonical DOIs. Can also be a single canonical DOI.
Returns: A list of DOI URLs (or a single value if a single DOI was given).
>>> to_url(['10.1209/0295-5075/111/40005'])
['http://dx.doi.org/10.1209/0295-5075/111/40005']
>>> to_url('10.1209/0295-5075/111/40005')
'http://dx.doi.org/10.1209/0295-5075/111/40005'
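The two conversions documented above can be sketched in a few lines of standard-library Python. This is an illustration, not libbmc's implementation: doi_to_url and url_to_doi are hypothetical names, and DOI_RE is a deliberately simplified pattern that is an assumption of this sketch, not the validation rule used by is_valid.

```python
import re

# Simplified DOI pattern (assumption): "10.", a registrant code with
# optional sub-parts, a slash, then any run of non-whitespace characters.
DOI_RE = re.compile(r"10\.\d{4,}(?:\.\d+)*/\S+")
DOI_URL_PREFIX = "http://dx.doi.org/"


def doi_to_url(dois):
    """Prefix a canonical DOI, or each DOI of a list, with the resolver URL."""
    if isinstance(dois, list):
        return [DOI_URL_PREFIX + doi for doi in dois]
    return DOI_URL_PREFIX + dois


def url_to_doi(urls):
    """Extract the canonical DOI from a URL, or from each URL of a list.

    Returns None as soon as one input contains no DOI.
    """
    single = not isinstance(urls, list)
    matches = [DOI_RE.search(url) for url in ([urls] if single else urls)]
    if any(match is None for match in matches):
        return None
    found = [match.group(0) for match in matches]
    return found[0] if single else found


print(url_to_doi("http://dx.doi.org/10.1209/0295-5075/111/40005"))
print(doi_to_url(["10.1209/0295-5075/111/40005"]))
```

Returning None for the whole list when a single item fails mirrors the to_canonical(['aaaa']) example above.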
libbmc.fetcher module
This file contains functions to download papers locally, optionally using a proxy.
libbmc.fetcher.download(url, proxies=None)
Download a PDF or DJVU document from a URL, optionally using proxies.
Parameters:
- url – The URL of the PDF/DJVU document to fetch.
- proxies – An optional list of proxy strings. Proxies are tried sequentially. Do not forget to include "" (empty string) in the list if you want to try direct fetching without any proxy.
Returns: A tuple of the raw content of the downloaded data and its associated content type, or (None, None) if the document could not be downloaded.
>>> download("http://arxiv.org/pdf/1312.4006.pdf")
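The sequential proxy fallback described above might look like the following sketch, written with urllib from the standard library. It is not libbmc's implementation: fetch_once and download_with_fallback are hypothetical names, the 30 second timeout is arbitrary, and treating proxies=None as a single direct attempt is an assumption.

```python
import urllib.request


def fetch_once(url, proxy):
    """Fetch `url` through `proxy`; "" means a direct connection."""
    # An empty mapping disables proxies (including environment ones).
    proxy_map = {"http": proxy, "https": proxy} if proxy else {}
    opener = urllib.request.build_opener(urllib.request.ProxyHandler(proxy_map))
    with opener.open(url, timeout=30) as response:
        return response.read(), response.headers.get_content_type()


def download_with_fallback(url, proxies=None, fetch=fetch_once):
    """Try each proxy in order and return (content, content_type).

    Returns (None, None) when every attempt fails.
    """
    for proxy in proxies or [""]:  # assumption: no list means direct only
        try:
            return fetch(url, proxy)
        except OSError:  # urllib.error.URLError is a subclass of OSError
            continue     # this proxy failed, try the next one
    return None, None
```

A call such as download_with_fallback(url, ["http://proxy.example:8080", ""]) would try the proxy first, then a direct connection. The fetch parameter exists only so the fallback logic can be exercised without network access.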
libbmc.isbn module
This file contains all the ISBN-related functions.
libbmc.isbn.extract_from_text(text)
Extract ISBNs from a text.
Parameters:
- text – Some text.
Returns: A list of canonical ISBNs found in the text.
>>> extract_from_text("978-3-16-148410-0 9783161484100 9783161484100aa abcd 0136091814 0136091812 9780136091817 123456789X")
['9783161484100', '9783161484100', '9783161484100', '0136091814', '123456789X']
libbmc.isbn.from_doi(doi_identifier)
Make an ISBN out of the given DOI.
Note: Taken from https://github.com/xlcnd/isbnlib/issues/30#issuecomment-167444777.
Note: See https://github.com/xlcnd/isbnlib#note. The returned ISBN may not be issued yet (it is a valid one, but does not necessarily correspond to an existing book).
Parameters:
- doi_identifier – A valid canonical DOI.
Returns: An ISBN string.
>>> from_doi('10.978.316/1484100')
'9783161484100'
libbmc.isbn.get_bibtex(isbn_identifier)
Get a BibTeX string for the given ISBN.
Parameters:
- isbn_identifier – The ISBN to fetch a BibTeX entry for.
Returns: A BibTeX string, or None if it could not be fetched.
>>> get_bibtex('9783161484100')
'@book{9783161484100,\n title = {Berkeley, Oakland: Albany, Emeryville, Alameda, Kensington},\n author = {Peekaboo Maps},\n isbn = {9783161484100},\n year = {2009},\n publisher = {Peek A Boo Maps}\n}'
libbmc.isbn.is_valid(isbn_id)
Check that a given string is a valid ISBN.
Parameters:
- isbn_id – The ISBN to be checked.
Returns: Boolean indicating whether the ISBN is valid or not.
>>> is_valid("978-3-16-148410-0")
True
>>> is_valid("9783161484100")
True
>>> is_valid("9783161484100aa")
False
>>> is_valid("abcd")
False
>>> is_valid("0136091814")
True
>>> is_valid("0136091812")
False
>>> is_valid("9780136091817")
False
>>> is_valid("123456789X")
True
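The True and False results above are consistent with the standard ISBN check-digit rules, which the following sketch implements. It is an illustration only: isbn_checksum_ok is a hypothetical function, it makes no claim about how libbmc validates ISBNs internally, and it does not check the 978/979 prefix of ISBN-13 numbers.

```python
import re


def isbn_checksum_ok(isbn_id):
    """Check the ISBN-10 or ISBN-13 check digit of a string."""
    chars = re.sub(r"[\s-]", "", isbn_id).upper()
    if re.fullmatch(r"[0-9]{9}[0-9X]", chars):
        # ISBN-10: weights 10 down to 1, "X" stands for 10, and the
        # weighted sum must be divisible by 11.
        values = [10 if c == "X" else int(c) for c in chars]
        return sum(w * v for w, v in zip(range(10, 0, -1), values)) % 11 == 0
    if re.fullmatch(r"[0-9]{13}", chars):
        # ISBN-13: alternating weights 1 and 3, and the weighted sum
        # must be divisible by 10.
        return sum(int(c) * (3 if i % 2 else 1) for i, c in enumerate(chars)) % 10 == 0
    return False


for candidate in ["978-3-16-148410-0", "0136091812", "123456789X"]:
    print(candidate, isbn_checksum_ok(candidate))
```

For example, 0136091814 has a weighted sum of 154, a multiple of 11, whereas changing the last digit to 2 gives 152, which is not.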
libbmc.isbn.to_doi(isbn_identifier)
Make a DOI out of the given ISBN.
Note: See https://github.com/xlcnd/isbnlib#note. The returned DOI may not be issued yet.
Parameters:
- isbn_identifier – A valid ISBN string.
Returns: A DOI as a string.
>>> to_doi('9783161484100')
'10.978.316/1484100'
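The relation between the two forms in the examples ('9783161484100' and '10.978.316/1484100') can be sketched as below. This is a hedged illustration, not libbmc's code: isbn_to_doi and doi_to_isbn are hypothetical names, and isbn_to_doi assumes the ISBN is already hyphenated into its five elements, since splitting a bare ISBN requires the ISBN range tables (presumably handled by isbnlib in the original, given the notes above).

```python
def isbn_to_doi(hyphenated_isbn13):
    """Build an ISBN-A style DOI from a hyphenated ISBN-13."""
    prefix, group, registrant, publication, check = hyphenated_isbn13.split("-")
    # "10.<prefix>.<group><registrant>/<publication><check digit>"
    return "10.{}.{}{}/{}{}".format(prefix, group, registrant, publication, check)


def doi_to_isbn(doi_identifier):
    """Recover the ISBN digits from an ISBN-A style DOI."""
    # Drop the leading "10." and keep only the digits of the remainder.
    return "".join(c for c in doi_identifier[len("10."):] if c.isdigit())


print(isbn_to_doi("978-3-16-148410-0"))
print(doi_to_isbn("10.978.316/1484100"))
```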
libbmc.tools module
This file contains various utility functions.
libbmc.tools.batch(iterable, size)
Get items from a sequence a batch at a time.
Parameters:
- iterable – An iterable to get batches from.
- size – The size of the batches.
Returns: A new batch of the given size at each iteration.
>>> [list(i) for i in batch([1, 2, 3, 4, 5], 2)]
[[1, 2], [3, 4], [5]]
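One common way to obtain this behaviour is a generator built on itertools.islice. The sketch below is an assumption about how such a helper can be written, not a copy of libbmc's code, and iter_batches is a hypothetical name.

```python
from itertools import islice


def iter_batches(iterable, size):
    """Yield successive lists of at most `size` items from `iterable`."""
    iterator = iter(iterable)
    while True:
        chunk = list(islice(iterator, size))
        if not chunk:  # the iterator is exhausted
            return
        yield chunk


batches = [list(b) for b in iter_batches([1, 2, 3, 4, 5], 2)]
print(batches)
```

The last batch is simply shorter when the number of items is not a multiple of size.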
libbmc.tools.clean_whitespaces(text)
Remove multiple whitespaces from text. Also removes leading and trailing whitespaces.
Parameters:
- text – Text to remove multiple whitespaces from.
Returns: A cleaned text.
>>> clean_whitespaces("this is a text with spaces")
'this is a text with spaces'
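A short sketch makes the effect visible on an input that actually contains repeated whitespace. collapse_whitespace is a hypothetical stand-in written for this illustration, and treating tabs and newlines like spaces is an assumption.

```python
def collapse_whitespace(text):
    """Collapse whitespace runs to single spaces and trim both ends."""
    # str.split() with no argument splits on any whitespace run and
    # ignores leading and trailing whitespace.
    return " ".join(text.split())


cleaned = collapse_whitespace("  this is   a text\twith  spaces ")
print(repr(cleaned))
```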
libbmc.tools.map_or_apply(function, param)
Map the function on param, or apply it, depending on whether param is a list or an item.
Parameters:
- function – The function to apply.
- param – The parameter to feed the function with (list or item).
Returns: The computed value or None.
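The list-or-item dispatch can be sketched as follows. The documentation does not say when None is returned; this sketch assumes it happens when the function raises, and map_or_apply_sketch is a hypothetical name rather than the library's own function.

```python
def map_or_apply_sketch(function, param):
    """Apply `function` to each item of a list, or to a single item.

    Assumption: any exception raised by `function` yields None.
    """
    try:
        if isinstance(param, list):
            return [function(item) for item in param]
        return function(param)
    except Exception:
        return None


print(map_or_apply_sketch(str.upper, ["a", "b"]))
print(map_or_apply_sketch(str.upper, "a"))
print(map_or_apply_sketch(int, ["1", "x"]))
```

Under this assumption a single failing item makes the whole list result None, which matches the to_canonical(['aaaa']) example in the libbmc.doi module.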
libbmc.tools.remove_duplicates(some_list)
Remove the duplicates from a list.
Parameters:
- some_list – The list to remove duplicates from.
Returns: A list without duplicates.
>>> remove_duplicates([1, 2, 3, 1])
[1, 2, 3]
>>> remove_duplicates([1, 2, 1, 2])
[1, 2]
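The examples keep the first occurrence of each item in its original position. A minimal order-preserving sketch is shown below; unique_in_order is a hypothetical name, and the use of a set assumes the items are hashable.

```python
def unique_in_order(some_list):
    """Return the items of `some_list` without duplicates, in first-seen order."""
    seen = set()
    result = []
    for item in some_list:
        if item not in seen:  # keep only the first occurrence
            seen.add(item)
            result.append(item)
    return result


print(unique_in_order([1, 2, 3, 1]))
print(unique_in_order([1, 2, 1, 2]))
```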
libbmc.tools.remove_urls(text)
Remove URLs from a given text (only removes http, https and naked domain URLs).
Parameters:
- text – The text to remove URLs from.
Returns: The text without URLs.
>>> remove_urls("foobar http://example.com https://example.com foobar")
'foobar foobar'
libbmc.tools.replace_all(text, replace_dict)
Replace multiple strings in a text.
Note: Replacements are made successively, with no guarantee on the order in which they are made.
Parameters:
- text – Text to replace in.
- replace_dict – A dictionary mapping each string to replace to its substitution.
Returns: Text after replacements.
>>> replace_all("foo bar foo thing", {"foo": "oof", "bar": "rab"})
'oof rab oof thing'
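Successive replacement can be sketched as a simple loop over the dictionary. replace_each is a hypothetical name used for this illustration, and iterating in dictionary order is an assumption, since the note above gives no guarantee about ordering.

```python
def replace_each(text, replace_dict):
    """Apply every replacement in `replace_dict` to `text`, one after another."""
    for old, new in replace_dict.items():
        text = text.replace(old, new)
    return text


plain = replace_each("foo bar foo thing", {"foo": "oof", "bar": "rab"})
chained = replace_each("ab", {"a": "b", "b": "c"})
print(plain, chained)
```

The second call shows why order matters: "a" first becomes "b", and the next replacement then turns every "b" into "c", giving "cc" instead of "bc".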
libbmc.tools.slugify(value)
Normalizes string, converts to lowercase, removes non-alpha characters, and converts spaces to hyphens to have nice filenames.
From Django’s “django/template/defaultfilters.py”.
>>> slugify("El pingüino Wenceslao hizo kilómetros bajo exhaustiva lluvia y frío, añoraba a su querido cachorro. ortez ce vieux whisky au juge blond qui fume sur son île intérieure, à Γαζέες καὶ μυρτιὲς δὲν θὰ βρῶ πιὰ στὸ χρυσαφὶ ξέφωτο いろはにほへとちりぬるを Pchnąć w tę łódź jeża lub ośm skrzyń fig กว่าบรรดาฝูงสัตว์เดรัจฉาน")
'El_pinguino_Wenceslao_hizo_kilometros_bajo_exhaustiva_lluvia_y_frio_anoraba_a_su_querido_cachorro_ortez_ce_vieux_whisky_au_juge_blond_qui_fume_sur_son_ile_interieure_a_Pchnac_w_te_odz_jeza_lub_osm_skrzyn_fig'