An open-source toolkit for entropic time series analysis

EntropyHub

Information and uncertainty can be regarded as two sides of the same coin: the more uncertainty there is, the more information we gain by removing that uncertainty. In the context of information and probability theory, entropy quantifies that uncertainty. The concept of entropy has its origins in classical physics under the second law of thermodynamics, a law considered to underpin our fundamental understanding of time in physics. Attempting to analyse the analog world around us requires that we measure […]
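As a small illustration of entropy quantifying uncertainty, the sketch below computes the Shannon entropy (in bits) of the empirical distribution of a symbol sequence. It is a generic textbook calculation, not EntropyHub's API; the function name `shannon_entropy` is chosen here for illustration only.

```python
import math
from collections import Counter


def shannon_entropy(symbols):
    """Shannon entropy, in bits, of the empirical distribution of `symbols`.

    Illustrative helper (not part of EntropyHub): H = sum_i p_i * log2(1 / p_i).
    """
    counts = Counter(symbols)
    total = len(symbols)
    # p * log2(1/p) written as (c/total) * log2(total/c) for each symbol count c
    return sum((c / total) * math.log2(total / c) for c in counts.values())


# A fair coin sequence carries 1 bit per symbol; a constant sequence carries none.
print(shannon_entropy("HTHTHTHT"))  # 1.0
print(shannon_entropy("HHHHHHHH"))  # 0.0
```

The more evenly the symbols are distributed, the higher the entropy: observing an outcome removes more uncertainty, so it conveys more information.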

A security analytics platform built for cloud-focused security teams

panther-analysis

Panther is a security analytics platform built for cloud-focused security teams. Panther enables teams to define detections as code and programmatically upload them to your Panther deployment.

Quick Start

    # Clone the repository
    git clone [email protected]:panther-labs/panther-analysis.git
    cd panther-analysis

    # Configure your Python environment
    make install
    make venv
    source venv/bin/activate

    # Install dependencies and run your first test!
    make deps
    panther_analysis_tool test --path aws_cloudtrail_rules/

Getting Started

The examples below demonstrate the local Panther workflow:

    # Run detection tests
    panther_analysis_tool test […]
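To give a feel for what "detections as code" can look like, here is a minimal sketch of a Python detection. It assumes the convention of a `rule(event)` function that returns True when an alert should fire; the event field names, the `title` helper, and the sample event are illustrative assumptions, not taken from the repository.

```python
# Hypothetical detection: flag failed console logins in CloudTrail-like events.
# Field names and the sample event are illustrative assumptions.


def rule(event):
    """Return True when the event should raise an alert."""
    if event.get("eventName") != "ConsoleLogin":
        return False
    outcome = event.get("responseElements") or {}
    return outcome.get("ConsoleLogin") == "Failure"


def title(event):
    """Build a human-readable alert title for a matching event."""
    source_ip = event.get("sourceIPAddress", "unknown IP")
    return f"Failed console login from {source_ip}"


# Made-up sample event, used only to exercise the sketch locally.
sample_event = {
    "eventName": "ConsoleLogin",
    "sourceIPAddress": "203.0.113.7",
    "responseElements": {"ConsoleLogin": "Failure"},
}

if rule(sample_event):
    print(title(sample_event))
```

Keeping the detection as a plain function makes it easy to unit-test against sample events before uploading it to a deployment.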

Hyperbolic Dimensionality Reduction via Horospherical Projections

HoroPCA

This code is the official PyTorch implementation of the ICML 2021 paper "HoroPCA: Hyperbolic Dimensionality Reduction via Horospherical Projections" by Ines Chami*, Albert Gu*, Dat Nguyen*, and Christopher Ré (Stanford University). Paper: https://arxiv.org/abs/2106.03306

Abstract. This paper studies Principal Component Analysis (PCA) for data lying in hyperbolic spaces. Given directions, PCA relies on: (1) a parameterization of subspaces spanned by these directions, (2) a method of projection onto subspaces that preserves information in these directions, and (3) an objective to optimize, namely the variance explained […]

A Scanpy extension for analyzing single-cell immune-cell receptor sequencing data

Scirpy

Scirpy is a scalable Python toolkit for analysing T cell receptor (TCR) or B cell receptor (BCR) repertoires from single-cell RNA sequencing (scRNA-seq) data. It integrates seamlessly with the popular scanpy library and provides various modules for data import, analysis and visualization.

Getting started: Please refer to the documentation, where you can also learn more about our immune-cell receptor model.

Case study: The case study from our preprint is available here.

Installation: You need to have […]

When is programming needed in most leading Self Service configurations?

To all data analysts, big and small: many corporates have self-service BI and DWH solutions (I am asking only about those who did NOT build an in-house solution):

- When is programming needed in most leading self-service configurations?
- When do analysts and business executives require coding and programming because the self-service application's slice and dice, filtering and fields are not enough?
- In some places, we junior analysts are getting a feeling (that may be […]

An indispensable Python: Data sourcing to Data science

The data analysis ecosystem has grown all the way from SQL to NoSQL and from Excel analysis to visualization. Today, we are short of resources to process ALL (you know what I mean by ALL) kinds of data coming into the enterprise. Data goes through profiling, formatting, munging or cleansing, pruning, and transformation steps on its way to analytics and predictive modeling. Interestingly, no single tool has proved to be an effective solution for running all these operations { Don’t forget the […]

Machine Learning with Signal Processing Techniques

Stochastic Signal Analysis is a field of science concerned with the processing, modification and analysis of (stochastic) signals. Anyone with a background in Physics or Engineering knows to some degree about signal analysis techniques, what these techniques are and how they can be used to analyze, model and classify signals. Data Scientists coming from different fields, like Computer Science or Statistics, might not be aware of the analytical power these techniques bring with them. In this blog post, we […]

PixieDust Support of Streaming Data

With the rise of IoT (Internet of Things) devices, being able to analyze and visualize live streams of data is becoming more and more important. For example, you could have sensors like thermometers in machines or portable medical devices like pacemakers, continuously streaming data to a streaming service like Kafka. PixieDust makes it easier to work with live data inside Jupyter Notebooks by providing simple integration APIs to both the PixieApp and display() framework. On the visualization level, PixieDust […]
