Random dataframe and database table generator

Random database/dataframe generator Beginners in SQL or data science often struggle to find a large sample database file (.DB or .sqlite) for practicing SQL commands. Would it not be great to have a simple tool or library to generate a large database with multiple tables, filled with data of one’s own choice? After all, databases break every now and then, and it is safest to practice with a randomly generated one 🙂 While it is […]
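As a rough illustration of the idea in this excerpt, here is a minimal sketch that fills a SQLite table with random rows using only the Python standard library. It is not the tool the post describes: the function names, the `customers` table, its columns, and the value ranges are all assumptions made for the example.

```python
import random
import sqlite3
import string


def random_name(rng, length=8):
    """Return a random lowercase string of the given length."""
    return "".join(rng.choice(string.ascii_lowercase) for _ in range(length))


def build_random_db(path=":memory:", n_rows=1000, seed=0):
    """Create a SQLite database with one table of random rows.

    The table name, columns, and value ranges are illustrative
    assumptions, not taken from the original post.
    """
    rng = random.Random(seed)  # seeded so the practice data is reproducible
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE customers ("
        "id INTEGER PRIMARY KEY, name TEXT, age INTEGER, balance REAL)"
    )
    rows = [
        (random_name(rng), rng.randint(18, 90), round(rng.uniform(0, 10_000), 2))
        for _ in range(n_rows)
    ]
    conn.executemany(
        "INSERT INTO customers (name, age, balance) VALUES (?, ?, ?)", rows
    )
    conn.commit()
    return conn


conn = build_random_db(n_rows=1000)
row_count = conn.execute("SELECT COUNT(*) FROM customers").fetchone()[0]
age_min, age_max = conn.execute("SELECT MIN(age), MAX(age) FROM customers").fetchone()
```

Passing a file path instead of `":memory:"` would write the database to disk, so it can be opened later with any SQLite client for practice queries.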


A Freqtrade Framework & Strategy with Python

MoniGoMani MoniGoMani aims to be more than just a conventional strategy: it’s a framework to “easily” find a profitable strategy configuration in any market, without the need to do any programming. However, you will need to know some Technical Analysis and be able to draw your own conclusions from your test results; this is not just an easy copy/paste. MGM (MoniGoMani) differs from other strategies in its use of something I called “weighted signals”. Each signal has its own weight […]


Python functions for robotic motion planning and task and motion planning

pybullet-planning (previously ss-pybullet) A repository of PyBullet utility functions for robotic motion planning, manipulation planning, and task and motion planning (TAMP). This repository was originally developed for the PDDLStream (previously named STRIPStream) approach to TAMP. With the help of Yijiang Huang, a stable and documented fork of pybullet-planning named pybullet_planning is available through PyPI. However, new features will continue to be introduced first through pybullet-planning. Installation Install for macOS or Linux using: pybullet-planning is intended to have ongoing support for […]


How can generative adversarial networks learn real-life distributions easily

A generative adversarial network, or GAN, is one of the most powerful machine learning models proposed by Goodfellow et al. for learning to generate samples from complicated real-world distributions. GANs have sparked millions of applications, ranging from generating realistic images or cartoon characters to text-to-image translations. Turing Award laureate Yann LeCun called GANs “the most interesting idea in the last 10 years in ML.” In the context of generating images, GANs consist of two parts. 1) A parameterized (deconvolutional) generator […]


Econ3: Understanding the media ecosystem and how it informs public opinion in the internet age featuring Hunt Allcott and David Rothschild

Episode 123 | June 10, 2021 In the world of economics, researchers at Microsoft are examining a range of complex systems—from those that impact the technologies we use to those that inform the laws and policies we create—through the lens of a social science that goes beyond the numbers to better understand people and society. Interviewed by Senior Principal Researcher Hunt Allcott, Economist David Rothschild discusses how the news media has evolved alongside social media  


Issue #134 – A Targeted Attack on Black-Box Neural MT

10 Jun 21. Issue #134 – A Targeted Attack on Black-Box Neural MT, in Robustness, The Neural MT Weekly. Author: Dr. Karin Sim, Machine Translation Scientist @ Iconic. Introduction: Last week we looked at how neural machine translation (NMT) systems are naturally susceptible to gender bias. In today’s blog post we look at the vulnerability of an NMT system to targeted attacks, which could result in unsolicited or harmful translations. Specifically, we report on work by Xu et al., 2021, which […]


A service for quickly deploying and using dockerized Computer Vision models

Inferoxy Inferoxy is a service for quickly deploying and using dockerized Computer Vision models. It’s a core part of EORA’s Computer Vision platform Vision Hub, which runs on top of AWS EKS. Why use it? You should use it if: You want to simplify deploying Computer Vision models with an appropriate Data Science stack to production: all you need to do is build a Docker image with your model, including any pre- and post-processing steps, and push it into an accessible registry […]


A PyTorch Lightning solution to training CLIP from scratch

train-CLIP A PyTorch Lightning solution to training CLIP from scratch. Usage 🚂 This training setup is easily usable right out of the box! Simply provide a training directory or your own dataset and we’ve got the rest covered. To train a model, just specify a model name from the paper and tell us your training folder and batch size. All possible models can be seen in the yaml files in models/config. python train.py --model_name RN50 --folder data_dir --batchsize 512 Training with […]


An open protocol for secure real-time exchange of large datasets

Delta Sharing Delta Sharing is an open protocol for secure real-time exchange of large datasets, which enables organizations to share data in real time regardless of which computing platforms they use. It is a simple REST protocol that securely shares access to part of a cloud dataset and leverages modern cloud storage systems, such as S3, ADLS, or GCS, to reliably transfer data. With Delta Sharing, a user accessing shared data can directly connect to it through pandas, Tableau, Apache […]


An in-progress web scraping project built with Python

New to Streaming Scraper An in-progress web scraping project built with Python, R, and SQL. A web scraping project that retrieves TV and movie data from two sources, then transforms and stores data in a MySQL database. Data are retrieved from two different data sources: What’s on Netflix (WON) and Rotten Tomatoes (RT). RT data are cleaned and transformed with Python, while WON data are cleaned and transformed with R. All data are piped into a MySQL database, then retrieved […]
