Exploratory Data Analysis for Employee Retention Dataset

Employee turnover is a very costly problem for companies. The cost of replacing an employee is often more than 100K USD, taking into account the time spent interviewing and finding a replacement, placement fees, sign-on bonuses and the loss of productivity for several months. It is only natural, then, that data science has started being applied to this area. Understanding why and when employees are most likely to leave can lead to actions to improve employee retention as well […]

Additional tools for particle accelerator data analysis and machine information

This package is a collection of useful scripts and tools for the Optics Measurements and Corrections group (OMC) at CERN.

Getting Started: This package is Python 3.7+ compatible, and can be installed through pip:

    pip install pylhc

One can also install from VCS:

    git clone https://github.com/pylhc/PyLHC
    pip install /path/to/PyLHC

Or simply from the online master branch, which is stable:

    pip install git+https://github.com/pylhc/PyLHC.git#egg=pylhc

After installing, scripts can be run with either python -m pylhc.SCRIPT --FLAG ARGUMENT or by calling the […]

An open source Python library for the interactive analysis of multidimensional datasets

HyperSpy is an open source Python library for the interactive analysis of multidimensional datasets that can be described as multidimensional arrays of a given signal (for example, a 2D array of spectra, also known as a spectrum image). HyperSpy makes it straightforward to apply analytical procedures that operate on an individual signal to multidimensional arrays, as well as providing easy access to analytical tools that exploit the multidimensionality of the dataset. Its modular structure makes it easy to add […]

Provides APIs for scientific and bioinformatic data analysis

Toolchest provides APIs for scientific and bioinformatic data analysis. It allows you to abstract away the costliness of running tools on your own resources by running the same jobs on secure, powerful remote servers.

Installation: The Toolchest client is available on PyPI:

    pip install toolchest-client

Usage: Using a tool in Toolchest is as simple as:

    import toolchest_client as toolchest
    toolchest.set_key("YOUR_TOOLCHEST_KEY")
    toolchest.kraken2(
        tool_args="",
        inputs="path/to/input.fastq",
        output_path="path/to/output.fastq",
    )

For a list of available tools, see the documentation.

Configuration: To use Toolchest, you must […]

Manage your exceptions in Python like a PRO

Manage your exceptions in Python like a PRO.

Installation:

    pip install tryceratops

Usage:

    tryceratops [filename or dir…]

You can enable experimental analyzers by running:

    tryceratops --experimental [filename or dir…]

You can ignore specific violations by using --ignore TCXXX repeatedly:

    tryceratops --ignore TC201 --ignore TC202 [filename or dir…]

You can exclude dirs by using --exclude dir/path repeatedly:

    tryceratops --exclude tests --exclude .venv [filename or dir…]

Violations: All violations and their descriptions can be found in docs. Ignoring […]

A toolkit to analyze time series data with python

Kats is a toolkit to analyze time series data: a lightweight, easy-to-use, and generalizable framework to perform time series analysis. Time series analysis is an essential component of Data Science and Engineering work in industry, from understanding the key statistics and characteristics, to detecting regressions and anomalies, to forecasting future trends. Kats aims to provide a one-stop shop for time series analysis, including detection, forecasting, feature extraction/embedding, multivariate analysis, etc. Kats is released by Facebook’s Infrastructure Data Science team. It […]

A Python library to analyze molecular dynamics trajectories

MDAnalysis is a Python library for the analysis of computer simulations of many-body systems at the molecular scale, spanning use cases from interactions of drugs with proteins to novel materials. It is widely used in the scientific community and is written by scientists for scientists. It works with a wide range of popular simulation packages including Gromacs, Amber, NAMD, CHARMM, DL_Poly, HooMD, LAMMPS and many others; see the lists of supported trajectory formats and topology formats. MDAnalysis also […]

X-ray Analysis for Synchrotron Applications using Python

Larch is an open-source library and set of applications for processing and analyzing X-ray absorption and fluorescence spectroscopy data and X-ray fluorescence and diffraction image data from synchrotron beamlines. It is especially focussed on X-ray absorption fine-structure spectroscopy (XAFS) including X-ray absorption near-edge spectroscopy (XANES) and extended X-ray absorption fine-structure spectroscopy (EXAFS). It also supports visualization and analysis tools for X-ray fluorescence (XRF) spectra and XRF and X-ray diffraction (XRD) images as collected at scanning X-ray microprobe beamlines. Larch […]

Sound Field Analysis toolbox for Python

Analyze, visualize and process sound field data recorded by spherical microphone arrays. The sound_field_analysis toolbox (short: sfa) is a Python port of the Sound Field Analysis Toolbox (SOFiA), originally by Benjamin Bernschütz. The main goal of the sfa toolbox is to analyze, visualize and process sound field data recorded by spherical microphone arrays. Furthermore, various types of test data may be generated to evaluate the implemented functions. It is an essential building block of ReTiSAR, an implementation of real time […]

A fast and lightweight server-side Web analytics solution

Ballcone is a fast and lightweight server-side Web analytics solution. It requires no JavaScript on your website.

Design Goals:

- Simplicity. Ballcone requires almost zero set-up as it prefers convention over configuration
- Efficiency. Ballcone performs lightning-fast analytic queries over data thanks to the underlying columnar database
- Specificity. Ballcone aims at providing visual insights on the HTTP access logs with no bloat

Features:

- No JavaScript snippets required
- GeoIP mapping with the GeoLite2 database
- Extraction of platform and browser information from […]
