AI & Machine Learning News: 27 April 2020
The Artificial Intelligence and Machine Learning News clippings for Quants are provided algorithmically by CloudQuant’s NLP engine, which seeks out articles relevant to our community and ranks them by our proprietary interest score. After all, shouldn’t you expect to see the news generated using AI?
StarGAN v2: Diverse Image Synthesis for Multiple Domains
Paper (arXiv): https://arxiv.org/abs/1912.01865
Paper (PDF): https://arxiv.org/pdf/1912.01865.pdf
Code (GitHub): https://github.com/clovaai/stargan-v2
Authors: Yunjey Choi* (Clova AI Research, NAVER) Youngjung Uh* (Clova AI Research, NAVER) Jaejun Yoo* (EPFL) Jung-Woo Ha (Clova AI Research, NAVER) (* indicates equal contribution)
Abstract: A good image-to-image translation model should learn a mapping between different visual domains while satisfying the following properties: 1) diversity of generated images and 2) scalability over multiple domains. Existing methods address either of the issues, having limited diversity or multiple models for all domains. We propose StarGAN v2, a single framework that tackles both and shows significantly improved results over the baselines. Experiments on CelebA-HQ and a new animal faces dataset (AFHQ) validate our superiority in terms of visual quality, diversity, and scalability. To better assess image-to-image translation models, we release AFHQ, high-quality animal faces with large inter- and intra-domain variations. The code, pre-trained models, and dataset are available at github.com/clovaai/stargan-v2.
Stanford CS229: Machine Learning Andrew Ng | Autumn 2018 – 20 videos
Full Playlist Set of 20 videos. Here is the course website with problem sets, syllabus, slides and class notes. The problem sets seem to be locked, but they are easily found on GitHub. For instance, this repo has all the problem sets for the autumn 2018 session.
FMSB Reviews Algo Trading and Machine Learning
The FICC Markets Standards Board (FMSB) has today published its first Spotlight Review, looking at emerging themes and challenges in algorithmic trading and machine learning.
This Spotlight Review highlights important emerging issues in this area to assist market participants in considering how to address challenges that may arise.
This Spotlight Review considers:
- Managing model risk in algorithmic trading;
- Challenges for algorithmic market making in less liquid instruments;
- Adoption of machine learning in algorithmic market making;
- Increased use of execution algorithms; and
- Best practice, and the role for practitioner-led solutions.
2020-04-23 09:40:53+00:00 Read the full story…
Weighted Interest Score: 5.4743, Raw Interest Score: 1.8420,
Positive Sentiment: 0.2498, Negative Sentiment 0.1873
CloudQuant Thoughts : Jump straight to the PDF at this link if you wish.
Phased Opening for NYSE Floor Talked
It all started under a Buttonwood tree back in 2008. Or was that 228 years ago?
It wasn’t simply a landscaping flourish or a nod to history that prompted NYSE Euronext executives to plant a group of buttonwood trees outside their massive new data center in Mahwah, N.J. in the spring of 2010.
According to Wall Street lore, it was under a buttonwood tree–better known as the sycamore–that 24 brokers formed the New York Stock Exchange in 1792. By planting six of the trees in Mahwah, the exchange operator was, of course, paying its respects to its heritage. But also, and more significantly, it was signaling that a new type of market center was being born.
The New York Stock Exchange might physically reopen in phases after May 15, two sources who were on a conference call with NYSE Chief Operating Officer Michael Blaugrund told CNN Business. The sources also said that the exact timing is subject to revision.
2020-04-24 14:30:45+00:00 Read the full story…
Weighted Interest Score: 4.4112, Raw Interest Score: 1.5534,
Positive Sentiment: 0.1146, Negative Sentiment 0.0962
CloudQuant Thoughts : We all want a return to normalcy but we are also data-centric souls and as long as the chart that we showed on our Alternative Data blog post last week continues to look like a rocket taking off to the moon, any talk of re-opening the Physical NYSE Exchange floor is premature. Sorry.
Why Alternative Data is Key to Analyzing the Consumer Sector • Integrity Research
Corporates in the consumer and retail sectors are increasing their sophistication as the industry is forcing them to leverage big data to be successful. Unfortunately, most Wall Street analysts have yet to catch up.
Consumer companies have become increasingly data driven. To be successful, they must understand point of sale, credit card transactions, web site virality, quality of impressions, and brand perception on social networks. They need to optimize how and where their products are stocked. Real-time factors of success are pushing the product cycles faster than ever.
Meanwhile, Wall Street has struggled to keep up. Analysts continue to focus on traditional metrics: sales per square foot, sales per employee, comparable store growth, and inventory turns. Beating or missing a single quarter is a very short-term signal and can be misleading. It is more critical to see where a company is headed over the next few seasons: is it equipped with the right design teams and the best products for its category? Does it keep the attention of its core audience and aspirational customers?
2020-04-27 05:41:00+00:00 Read the full story…
Weighted Interest Score: 4.0581, Raw Interest Score: 1.7606,
Positive Sentiment: 0.2622, Negative Sentiment 0.1873
CloudQuant Thoughts : Knowing that Alternative Data is the key is one thing; if you cannot fathom how to use that key, you are still in limbo. Head over to our Data Catalog to see not only some of the best alternative datasets available, but also the code and the data to back them up. Yes, no longer must you read through unwieldy white papers trying to work out what they did and with what data – we provide the white paper, the code and the data! Reproducible results.
Google claims its AI can design computer chips in under 6 hours
In a preprint paper coauthored by Google AI lead Jeff Dean, scientists at Google Research and the Google chip implementation and infrastructure team describe a learning-based approach to chip design that can learn from past experience and improve over time, becoming better at generating architectures for unseen components. They claim it completes designs in under six hours on average, which is significantly faster than the weeks it takes human experts in the loop.
While the work isn’t entirely novel — it builds upon a technique proposed by Google engineers in a paper published in March — it advances the state of the art in that it implies the placement of on-chip transistors can be largely automated. If made publicly available, the Google researchers’ technique could enable cash-strapped startups to develop their own chips for AI and other specialized purposes. Moreover, it could help to shorten the chip design cycle to allow hardware to better adapt to rapidly evolving research.
2020-04-23 00:00:00 Read the full story…
Weighted Interest Score: 2.3550, Raw Interest Score: 1.4778,
Positive Sentiment: 0.3855, Negative Sentiment 0.0643
CloudQuant Thoughts : This is amazing, weeks down to 6 hours! “We can essentially have a machine learning model that learns to play the game of [component] placement for a particular chip.” At the same time it is optimizing for Power, Performance and Area! AMAZING!
How Microsoft Is Using ML To Secure Its Software Development Cycle
Tech giant Microsoft recently built a machine learning classification system that aims to secure the software development lifecycle. The system classifies bugs as security or non-security and as critical or non-critical, with a level of accuracy akin to that of security experts.
The software developers at Microsoft address a steady stream of issues and vulnerabilities. More than 45,000 developers generate nearly 30,000 bugs per month, which get stored across more than 100 AzureDevOps and GitHub repositories. The tech giant is looking to mitigate these vulnerabilities.
Since 2001, the tech giant has collected 13 million work items and bugs. According to sources, Microsoft spends an estimated $150,000 per issue to mitigate bugs and vulnerabilities.
However, according to the developers, with more than 45,000 developers already working on the problem, simply applying more human resources to better label and prioritise the bugs is not feasible.
To build the machine learning model, the tech giant trained it on the 13 million work items and bugs collected over those two decades. They stated, “We used that data to develop a process and machine learning model that correctly distinguishes between security and non-security bugs 99% of the time, and accurately identifies the critical, high priority security bugs 97% of the time.”
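As a rough illustration of this kind of two-stage triage (security vs non-security first, then severity), the sketch below trains a text classifier on bug titles. It is a toy example, not Microsoft's actual system; the titles, labels, and model choice are all invented for illustration.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Toy bug titles; a real system would train on millions of labeled work items.
titles = [
    "Buffer overflow in parser",          # security
    "SQL injection in login form",        # security
    "Typo in settings dialog",            # non-security
    "Button misaligned on resize",        # non-security
    "XSS vulnerability in comments",      # security
    "Crash when file is empty",           # non-security
]
is_security = [1, 1, 0, 0, 1, 0]

# Stage 1: security vs non-security, trained on bug titles alone.
# (A second, analogous classifier would then rank severity among
# the bugs flagged as security-related.)
clf = make_pipeline(TfidfVectorizer(), LogisticRegression())
clf.fit(titles, is_security)
print(clf.predict(["Heap overflow in image decoder"]))
```

With only titles as input, TF-IDF plus a linear model is a common, cheap baseline; the article does not specify which model family Microsoft actually used.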
2020-04-26 05:30:00+00:00 Read the full story…
Weighted Interest Score: 2.9145, Raw Interest Score: 2.3181,
Positive Sentiment: 0.0247, Negative Sentiment 0.2219
CloudQuant Thoughts : I have tried Kite and it was OK… It makes sense for an AI to be looking at my code and searching repositories for answers to the questions I may come up with. While I am sure this will still take some time to perfect, I can see a point in the future where programming is… PLEASANT!
Codota raises $12 million for AI that suggests and autocompletes code
Companies like Codota seem to be getting a lot of investor attention lately, and there’s a reason. According to a study published by the University of Cambridge’s Judge Business School, programmers spend 50.1% of their work time not programming; the other half is debugging. And the total estimated cost of debugging is $312 billion per year. AI-powered code suggestion and review tools, then, promise to cut development costs substantially while enabling coders to focus on more creative, less repetitive tasks.
2020-04-27 00:00:00 Read the full story…
Weighted Interest Score: 3.3450, Raw Interest Score: 1.7099,
Positive Sentiment: 0.0950, Negative Sentiment 0.1267
CloudQuant Thoughts : Another One!
Has AI Failed Us During This Crisis?
The hype around artificial intelligence is under the scanner as the technology has not made a big impact in the fight against COVID-19. Undoubtedly, AI has taken centre stage within various organisations to drive business growth, but its effectiveness across a wide range of use cases is once again being questioned. This is because researchers have failed to bring anything to the table that could significantly help the world fight COVID-19.
Today, the world needs AI more than ever to slow the spread of the deadly virus and, in turn, save thousands of lives. Has AI ultimately failed us all during the COVID-19 crisis?
2020-04-21 00:00:00 Read the full story…
Nvidia launches Project MONAI AI framework for health care research in alpha
Nvidia, in conjunction with King’s College London, announced the open source alpha release of Project MONAI today, a framework for health care research that’s available now on GitHub. MONAI stands for Medical Open Network for AI. The framework is optimized for the demands of health care researchers and made for running with deep learning frameworks like PyTorch and Ignite. A main goal of the MONAI framework is to help researchers reproduce their experiments in order to build upon each other’s work. One example in the alpha release is data augmentation during training, with defined interfaces to control random states and ensure training results stay the same, Nvidia VP of healthcare Kimberly Powell told VentureBeat in an email.
2020-04-21 00:00:00 Read the full story…
Ensuring Business Continuity with AI during the recession
COVID-19 has created an unprecedented situation across the world – one that has cast doubts on business continuity in many industries. The adjustments that have had to be made should prompt all organizations to put in place a robust business continuity plan that can weather uncommon situations like the one we are in.
Advances in machine learning (ML) and artificial intelligence (AI) are expected to help businesses stay afloat as they try to endure the impacts of an economic downturn. As recession looms, business management teams look within their operations to understand what they can utilize – concentrating on retaining current clients, bringing best-case deals into the pipeline, and preserving cash reserves.
Many will depend on trimming costs in response to losses, or defer investments in innovation that had been expected to drive growth. Companies can be predictive and dynamic in their decision-making to preserve business continuity and achieve operational resilience. They can deploy a virtual workforce program that enables their worldwide employees to work from home with almost no interruption to business operations.
2020-04-20 00:00:00 Read the full story…
DIAGNOS to Extend CARA (Computer Assisted Retinal Analysis), its AI Application for Early Detection of Diabetes-Caused Blindness, to Assess the Effects of COVID-19 on Patients
DIAGNOS Inc., a leader in early detection of critical health issues using advanced Artificial Intelligence (AI) tools, provides an update on its CARA (Computer Assisted Retinal Analysis) AI Application, following the Corporation’s participation in the US White House call-to-action data analysis program (“Program”), as referred to in the Corporation’s March 25, 2020 press release. Pursuant to the Corporation’s analysis of the technical details sourced from the Program, it is developing a new add-on test for CARA to provide an innovative solution for COVID-19 patients by offering a means to monitor health through retina analysis. DIAGNOS’ scientific team has been able to extend the Corporation’s core CARA application, built for early detection of blindness caused by diabetes, to assessing the effects of COVID-19 on patients.
2020-04-27 00:00:00 Read the full story…
Weighted Interest Score: 3.0030, Raw Interest Score: 1.3526,
Positive Sentiment: 0.2405, Negative Sentiment 0.1202
Beginner’s Guide to Exploratory Data Analysis on Text Data – The Importance of Exploratory Data Analysis (EDA)
There are no shortcuts in a machine learning project lifecycle. We can’t simply skip to the model building stage after gathering the data. We need to plan our approach in a structured manner and the exploratory data analytics (EDA) stage plays a huge part in that. I can say this with the benefit of hindsight having personally gone through this situation plenty of times.
In my early days in this field, I couldn’t wait to dive into machine learning algorithms but that often left my end result hanging in the balance. I discovered, through personal experience and the advice of my mentors, the importance of spending time exploring and understanding my data.
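The kind of exploration the author recommends can start very simply. The sketch below, using made-up documents, computes a few first-look statistics (token counts and top vocabulary) before any modeling:

```python
from collections import Counter
import re

# Invented sample documents standing in for a real corpus.
docs = [
    "Markets rallied as tech stocks surged on strong earnings.",
    "Earnings season begins with strong results from tech.",
    "Oil prices fell sharply amid demand concerns.",
]

# Basic EDA before any modeling: document lengths and vocabulary.
tokens = [re.findall(r"[a-z']+", d.lower()) for d in docs]
lengths = [len(t) for t in tokens]
vocab = Counter(tok for t in tokens for tok in t)

print("doc lengths:", lengths)
print("vocabulary size:", len(vocab))
print("top tokens:", vocab.most_common(3))
```

Even these crude statistics reveal corpus skew, dominant terms, and outlier documents – exactly the surprises you want to find before the model-building stage.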
2020-04-26 19:00:54+00:00 Read the full story…
Weighted Interest Score: 2.7487, Raw Interest Score: 1.2202,
Positive Sentiment: 0.1581, Negative Sentiment 0.1581
Prometeia Turkey expands offer with new Data Science team
Prometeia Turkey expands its expertise with a data science team targeting many different sectors, with a main focus on financial sector companies. It will do so thanks to a newly established local staff of seasoned AI and business experts, led by Seçil Arslan, who joins Prometeia after seven years on Yapı Kredi’s R&D team. The team brings applied AI experience and will provide fast delivery of end-to-end custom AI solutions that enable the digitalization of business workflows, supported by the Data Science competence centre in Italy.
As Artificial Intelligence and Data Science techniques empower companies in their digital transformation and growth roadmaps, Prometeia intends to support this dramatic transformation in the Turkish and Middle East markets.
Our Data Science team in Turkey aims to bring together innovative AI technologies and data science solutions with the following five main capabilities:
2020-04-27 00:00:00 Read the full story…
Weighted Interest Score: 4.7158, Raw Interest Score: 2.2031,
Positive Sentiment: 0.0958, Negative Sentiment 0.0319
What Is Narrow AI & How It Is Different From Artificial General Intelligence
When Alan Turing first imagined machines that could think like humans, he was probably thinking of machines that could one day make life easier for human beings. Fast forward 70 years, and AI has been able to perform tasks that have undoubtedly made life more comfortable. Conversational AI, flying drones, bots, language translation, facial recognition, etc., are some of the most promising AI applications we have today. But these fall under narrow AI rather than artificial general intelligence, which is something different.
What Is Narrow AI?
As the definition goes, narrow AI is a specific type of artificial intelligence in which technology outperforms humans in a narrowly defined task. It focuses on a single subset of cognitive abilities and advances in that spectrum.
Over the years, narrow AI has outperformed humans at certain tasks. These include calculations and quantification that have been performed more efficiently with this technology. Today, it has also outperformed human beings in complex games like Go and chess, along with helping make intelligent business decisions, and more.
After narrow AI trumped human performance, the next step came in the form of general AI.
2020-04-26 08:30:00+00:00 Read the full story…
Weighted Interest Score: 4.5656, Raw Interest Score: 1.5658,
Positive Sentiment: 0.4697, Negative Sentiment 0.1827
Scorable launches second credit risk analysis product
Scorable has launched a second product to enhance the scope and accuracy of its credit risk analysis, helping fixed-income managers make better investment decisions.
The company’s innovative artificial intelligence (AI) solution enables asset managers to monitor corporate bonds and credit spreads and to anticipate rating changes before they occur or markets price them in.
With Covid-19 and the oil price collapse causing massive turmoil in financial markets, careful risk management is more important than ever. More than $92 billion of corporate debt fell to high yield from investment grade in March, and an end to the downward spiral is not in sight. Over the next few weeks, the number of issuers that lose their investment grade rating – so-called “fallen angels” – will continue to increase.
2020-04-27 00:00:00 Read the full story…
Weighted Interest Score: 4.5218, Raw Interest Score: 2.3256,
Positive Sentiment: 0.4104, Negative Sentiment 0.2462
Machine Learning using C++ for Linear and Logistic Regression
The applications of machine learning transcend boundaries and industries so why should we let tools and languages hold us back? Yes, Python is the language of choice in the industry right now but a lot of us come from a background where Python isn’t taught!
The computer science faculty in universities are still teaching programming in C++ – so that’s what most of us end up learning first. I understand why you should learn Python – it’s the primary language in the industry and it has all the libraries you need to get started with machine learning.
But what if your university doesn’t teach it? Well – that’s what inspired me to dig deeper and use C++ for building machine learning algorithms. So if you’re a college student, a fresher in the industry, or someone who’s just curious about picking up a different language for machine learning – this tutorial is for you!
In this first article of my series on machine learning using C++, we will start with the basics. We’ll understand how to implement linear regression and logistic regression using C++!
2020-04-22 01:42:10+00:00 Read the full story…
Weighted Interest Score: 4.3269, Raw Interest Score: 1.8530,
Positive Sentiment: 0.0750, Negative Sentiment 0.2785
How AutoML 2.0 Offers Its Two-Fold Advantage To Traditional Data Science
There has been rapid growth and advancements in AutoML systems over the last few years. AutoML automates the full development lifecycle for enterprise AI and ML applications, and makes it possible for a data scientist to automate the optimisation and selection of ML models, but it does encounter some limitations. Now, with the next version, AutoML 2.0, these systems plan to automate the most complicated, and time-consuming part of the enterprise AI development lifecycle – feature engineering, which typically takes months using traditional methods.
Previous versions of AutoML platforms have been more about automating the machine learning part of data science. But one of the most challenging parts of traditional data science is feature engineering, which involves a lot of manual activity. Feature engineering consists of connecting data and building a feature data table with a set of diverse features that will be evaluated against multiple machine learning algorithms. The problem with feature engineering is that it requires deep domain expertise, as it involves ideating new features. This involves a lot of iteration as features are evaluated and rejected or chosen. Now, platforms with automated feature engineering capabilities allow for the automated creation of feature tables from relational data sources and flat files. This ability to generate features automatically is impactful and game-changing for data science.
Beyond automation, AutoML 2.0 will also allow BI analysts, data engineers, and others in an organisation with deep domain knowledge to contribute to the development of ML and AI models. With automation in feature engineering, BI teams have the opportunity to develop sophisticated algorithms in a matter of days.
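In miniature, the automated feature engineering described above amounts to sweeping candidate aggregations over relational data to build a feature table. The snippet below is a hand-rolled sketch of the idea in pandas (the data and aggregation choices are invented), not a depiction of any specific AutoML 2.0 product:

```python
import pandas as pd

# Toy relational data: one row per transaction, keyed by customer.
transactions = pd.DataFrame({
    "customer_id": [1, 1, 2, 2, 2, 3],
    "amount": [20.0, 35.0, 5.0, 7.5, 12.5, 100.0],
})

# Automated feature engineering, in miniature: sweep a set of
# candidate aggregations over the child table to produce one
# feature row per customer, ready for model evaluation.
aggs = {"amount": ["count", "sum", "mean", "max"]}
features = transactions.groupby("customer_id").agg(aggs)
features.columns = ["_".join(col) for col in features.columns]
print(features)
```

Real platforms extend this pattern across many tables, join keys, and transformation types, then score the generated features against multiple algorithms – the part that traditionally takes months by hand.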
2020-04-26 04:30:00+00:00 Read the full story…
Weighted Interest Score: 4.2264, Raw Interest Score: 2.3998,
Positive Sentiment: 0.2969, Negative Sentiment 0.2227
Singapore central bank backs Tradeteq quantum credit scoring project
London-based Tradeteq has received funding from Singapore’s central bank for a project to develop quantum computing-based credit scoring methods for companies.
The exploratory research, undertaken in collaboration with Singapore Management University (SMU), is supported by the Monetary Authority of Singapore under the Financial Sector Technology & Innovation (FSTI) – Artificial Intelligence and Data Analytics (AIDA) Grant Scheme.
SMU and Tradeteq’s objective is to build a predictive machine learning model which has the potential to improve credit scoring accuracy. The model will be implemented on both a quantum computer and a simulated quantum computer.
2020-04-27 09:20:00 Read the full story…
Weighted Interest Score: 3.9396, Raw Interest Score: 2.1053,
Positive Sentiment: 0.4605, Negative Sentiment 0.0000
MindsDB, AutoML Startup, Gains Seed Funding
The maintainers of an open source framework for automating machine learning projects have raised its profile with a comparatively modest but strategic funding round led by an investor with ties to a string of emerging AI efforts springing from the University of California at Berkeley.
The $3 million seed funding round announced on April 16 was led by OpenOcean. The Finland-based fund is headed by Patrik Backman, who helped lead earlier open source projects such as MySQL and MariaDB. MindsDB has so far raised $4.2 million, according to the venture capital tracking website Crunchbase.com.
The startup’s AutoML framework aims to streamline the use of neural networks while making it easier for developers to integrate machine learning into production workloads. The framework emphasizes AI explainability and trust, allowing developers to select the data needed for a forecast and then automating the analytics process.
2020-04-20 00:00:00 Read the full story…
Weighted Interest Score: 3.7852, Raw Interest Score: 1.8224,
Positive Sentiment: 0.0388, Negative Sentiment 0.0388
What to Look for When Modernizing the Data Lake
Data lake adoption has more than doubled over the past three years. The technologies and best practices surrounding data lakes continue to evolve – and so do the challenges.
Currently in use by 45% of DBTA subscribers to support data science, data discovery and real-time analytics initiatives, data lakes are still underpinned by Hadoop in many cases, although cloud-native approaches are on the rise. From data governance and security, to data integration and architecture, new approaches are required for success.
DBTA recently held a webinar with Ali LeClerc, director of product marketing at Alluxio, and Ritu Jain, director of product marketing at Qlik, who discussed how leading companies are optimizing their data lakes for speed, scale, and agility.
2020-04-24 00:00:00 Read the full story…
Weighted Interest Score: 3.6802, Raw Interest Score: 2.0305,
Positive Sentiment: 0.1692, Negative Sentiment 0.2115
MIT aims for energy efficiency in AI model training
In a newly published paper, MIT researchers propose a system for training and running AI models in a way that’s more environmentally friendly than previous approaches. They claim it can cut down on the pounds of carbon emissions involved to “low triple digits” in some cases, mainly by improving the computational efficiency of the models.
Impressive feats have been achieved with AI across domains like image synthesis, protein modeling, and autonomous driving, but the technology’s sustainability issues remain largely unresolved. Last June, researchers at the University of Massachusetts at Amherst released a report estimating that the amount of power required for training and searching a certain model involves the emissions of roughly 626,000 pounds of carbon dioxide — equivalent to nearly 5 times the lifetime emissions of the average U.S. car.
2020-04-23 00:00:00 Read the full story…
Weighted Interest Score: 3.6339, Raw Interest Score: 1.9420,
Positive Sentiment: 0.4923, Negative Sentiment 0.0821
Why Your Company Needs White-Box Models in Enterprise Data Science
AI is having a profound impact on customer experience, revenue, operations, risk management and other business functions across multiple industries. When fully operationalized, AI and Machine Learning (ML) enable organizations to make data-driven decisions with unprecedented levels of speed, transparency, and accountability. This dramatically accelerates digital transformation initiatives delivering greater performance and a competitive edge to organizations. ML projects in data science labs tend to adopt black-box approaches that generate minimal actionable insights and result in a lack of accountability in the data-driven decision-making process. Today with the advent of AutoML 2.0 platforms, a white-box model approach is becoming increasingly important and possible.
White-box models (WBMs) provide clear explanations of how they behave, how they produce predictions, and what variables influenced the model. WBMs are preferred in many enterprise use cases because of their transparent ‘inner-working’ modeling process and easily interpretable behavior. For example, linear models and decision/regression tree models are fairly transparent, one can easily explain how these models generate predictions. WBMs render not only prediction results but also influencing variables, delivering greater impact to a wider range of participants in enterprise AI projects.
Data scientists are often math and statistics specialists and create complex features using highly-nonlinear transformations. These types of features may be highly correlated with the prediction target but are not easily explainable from the perspective of customer behaviors. Deep learning (neural networks) computationally generates features, but such “black-box” features are understandable neither quantitatively nor qualitatively. These statistical or mathematical feature-based models are at the heart of black-box models. Deep learning (neural network), boosting, and random forest models are highly non-linear by nature and are harder to explain, also making them “black-box.”
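As a concrete contrast to those black-box models, the sketch below fits a shallow decision tree – one of the white-box examples the article names – and prints its full decision logic, something a deep network cannot offer:

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, export_text

# A shallow decision tree is a classic white-box model: its entire
# decision process can be printed and audited rule by rule.
X, y = load_iris(return_X_y=True)
tree = DecisionTreeClassifier(max_depth=2, random_state=0).fit(X, y)

# Every split threshold and influencing variable is visible.
print(export_text(tree, feature_names=load_iris().feature_names))
```

The printed rules show exactly which variables influenced each prediction – the transparency WBMs offer that highly non-linear boosting, random forest, and neural network models do not.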
2020-04-23 21:30:25+00:00 Read the full story…
Weighted Interest Score: 3.4738, Raw Interest Score: 1.9760,
Positive Sentiment: 0.2640, Negative Sentiment 0.1789
7 Key Benefits of Proper Data Lake Ingestion
Data lake ingestion is so important for properly maintaining and understanding your data. Here are the most powerful benefits of proper data lake ingestion.
It’s impossible to deny the importance of data in several industries, but that data can become overwhelming if it isn’t properly managed. The problem is that managing and extracting valuable insights from all this data requires exceptional data collection, which makes data ingestion vital. The following will highlight seven key benefits of proper ingestion.
- Proper Scalability
- Covering Data Types
- Capturing High-Velocity Data
- Sanitizing Data
- Data Analytics Simplified
- Stores in Raw Format
- Uses Powerful Algorithms
2020-04-24 19:01:24+00:00 Read the full story…
Weighted Interest Score: 3.2883, Raw Interest Score: 1.5679,
Positive Sentiment: 0.3920, Negative Sentiment 0.2831
Record Demand For Data Due to Covid-19
Volatility caused by the Covid-19 pandemic has led to record data usage, according to provider Refinitiv, with a 50% increase in mobile usage as staff are forced to work remotely.
Andrea Remyn Stone, chief customer proposition officer at Refinitiv, told Markets Media: “We have seen record data usage during the pandemic, with some interesting trends in the ‘data on the data’.”
She continued that, for example, there has been an eightfold increase in demand for mortgage data. There has also been more demand for debt data such as leveraged loans, corporate bonds and credit profiles.
“There has been a 20% increase in web usage and 50% on mobile usage,” Remyn Stone added. “Daily messages across our platform have grown to 186 billion a day, compared to 80 billion after the Brexit vote, and between 40 to 50 billion on a normal day and we have not had any outages.”
2020-04-24 15:50:16+00:00 Read the full story…
Weighted Interest Score: 3.2654, Raw Interest Score: 1.7544,
Positive Sentiment: 0.1132, Negative Sentiment 0.1321
NLP Pipeline Tutorial for Text Classification Modeling
A data science Python tutorial on preprocessing combined text and numeric data using sklearn’s FeatureUnion, Pipeline, and transformers.
2020-04-27 13:54:45.583000+00:00 Read the full story…
Weighted Interest Score: 3.2206, Raw Interest Score: 1.6554,
Positive Sentiment: 0.0598, Negative Sentiment 0.0997
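The tutorial's approach can be sketched as follows. This toy example uses sklearn's ColumnTransformer – the modern equivalent of the FeatureUnion pattern the tutorial describes – to route a text column and a numeric column through separate preprocessing before a single classifier; the data and column names are invented:

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Toy data mixing a text column with a numeric column.
df = pd.DataFrame({
    "headline": ["stocks rally on earnings", "bonds slip as yields rise",
                 "earnings beat lifts shares", "yields climb, bonds fall"],
    "word_count": [4, 5, 4, 4],
})
y = [1, 0, 1, 0]

# Route each column through its own preprocessing, then classify.
pre = ColumnTransformer([
    ("text", TfidfVectorizer(), "headline"),    # text -> TF-IDF features
    ("num", StandardScaler(), ["word_count"]),  # numeric -> standardized
])
model = Pipeline([("pre", pre), ("clf", LogisticRegression())])
model.fit(df, y)
print(model.predict(df))
```

The payoff of the pipeline pattern is that the whole preprocessing-plus-model object fits, predicts, and cross-validates as a single unit, so text and numeric transformations never drift out of sync.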
How ‘Bias Bounties’ May Put Ethics Principles Into Practice
In a paper published recently with the title ‘Toward Trustworthy AI Development: Mechanisms for Supporting Verifiable Claims’, a team of researchers from Google Brain, Intel, OpenAI and other top labs from the US and Europe have launched a toolbox to turn AI ethics principles into practice. The kit for organisations developing AI models also includes the idea of rewarding developers for successfully detecting bias in AI, similar to the bug bounties awarded for finding flaws in security software. As per the authors of the paper, the bias bounty hunting community is still in its nascent stage, but can be useful in discovering biases.
The idea of bias bounties was first suggested in 2018 by co-author JB Rubinovitz. The recently published paper suggests ten different approaches to turn AI ethics principles into practice. Looking at recent efforts, more than 80 organisations have come up with different AI ethics principles. However, the authors of the paper firmly believe that the present set of norms and regulations is insufficient to develop responsible AI. The team has also advised ‘red-teaming’ to detect vulnerabilities, along with third-party auditing and government policies to create new regulations specific to market needs. The team also makes several other recommendations, such as:
- Create a centralized incident database by sharing incidents about AI as a community
- Maintain an audit trail during the development and deployment of AI systems for safety-critical applications
- Stringent scrutiny of commercial models along with alternative open sources for commercial AI systems
- Better support for privacy-centric techniques, such as federated learning, differential privacy, and encrypted computation
- Verify hardware performance claims made by researchers through increased government funding
2020-04-26 07:30:00+00:00 Read the full story…
Weighted Interest Score: 3.2099, Raw Interest Score: 1.2655,
Positive Sentiment: 0.2344, Negative Sentiment 0.5156
Why Sigmoid: A Probabilistic Perspective
This post aims to give an extensive yet intuitive set of reasons why the logistic sigmoid function is chosen for the linear classification model known as logistic regression, from a probabilistic perspective.
If you have taken any machine learning courses before, you must have come across logistic regression at some point. There is this sigmoid function that links the linear predictor to the final prediction. Depending on the course, this sigmoid function may be pulled out of thin air and introduced as the function that maps the number line to the desired range [0, 1]. There is an infinite number of functions that could do this mapping, wh…
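As a quick illustration of that mapping, a few lines of Python confirm that the sigmoid squashes any real-valued linear predictor into (0, 1), and that the log-odds (logit) function inverts it, which is the probabilistic link the post builds on:

```python
# The logistic sigmoid maps the real line into (0, 1);
# the logit (log-odds) function is its inverse.
import math

def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))

def logit(p: float) -> float:
    return math.log(p / (1.0 - p))

for z in (-10.0, -1.0, 0.0, 1.0, 10.0):
    p = sigmoid(z)
    assert 0.0 < p < 1.0          # every linear predictor becomes a valid probability
    assert abs(logit(p) - z) < 1e-9  # logit recovers the original log-odds

print(sigmoid(0.0))  # 0.5: zero log-odds means even odds
```

In logistic regression the linear predictor is interpreted as log-odds, which is exactly why this particular squashing function, rather than any other map onto [0, 1], falls out of the model.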
2020-04-26 22:42:56.992000+00:00 Read the full story…
Weighted Interest Score: 3.1838, Raw Interest Score: 1.4957,
Positive Sentiment: 0.1184, Negative Sentiment 0.0968
Abu Dhabi Global Market taps regtech startup to automate licence applications
Regtech startup Nexus FrontierTech has joined forces with Abu Dhabi Global Market (ADGM) to pilot an AI-based system to automate the licence application process for VC fund managers entering the emirate. Nexus and ADGM’s Financial Services Regulatory Authority (FSRA) have built a “RegBot”, which utilises natural language processing and machine learning to identify and immediately clarify information and risk gaps in licence applications.
A draft application form is automatically completed for the applicant. At the same time, an assessment report is generated for review by the FSRA. Nexus says the bot should help increase business efficiency for all stakeholders and reduce turnaround time while ensuring compliance with FSRA’s rules and regulations.
2020-04-27 00:01:00 Read the full story…
Weighted Interest Score: 3.1633, Raw Interest Score: 1.8018,
Positive Sentiment: 0.3003, Negative Sentiment 0.0000
Coming to Grips with COVID-19’s Data Quality Challenges
The COVID-19 pandemic is generating enormous amounts of data. Large amounts of data about infection rates, hospital admissions, and deaths per 100,000 are available with just a few button clicks. However, despite the large amount of data, we don’t necessarily have a better view of what’s actually happening on the ground, and the big COVID-19 data sets aren’t directly translating into better decision-making, data experts tell Datanami. As we’ve discussed many times in this publication, managing big data is hard. It’s not difficult to store petabytes worth of data (or even exabytes, which is fast becoming the delineation point for “big data”). But if you want to store that data in a manner that allows groups of individuals to access, analyze, and use that data for modeling purposes in a clean, repeatable, secure, and governed manner – well, that’s where things get interesting.
The COVID-19 pandemic is a once-in-a-lifetime event (hopefully) and organizations around the world are pulling out the stops to get in front of the disease. That has triggered a veritable tsunami of data collection and generation. Unfortunately, in the heat of the viral emergency, organizations haven’t put as much thought into important details about the data, ranging from how it was collected and transformed, what format it’s stored in, who has access to it, and how accurate it is. That’s to be expected during a time like this, but it doesn’t help the situation.
2020-04-21 00:00:00 Read the full story…
Weighted Interest Score: 3.1122, Raw Interest Score: 1.4770,
Positive Sentiment: 0.1524, Negative Sentiment 0.2696
IBM bolsters its software portfolio for fighting financial crime through Fenergo’s Customer Lifecycle Management
Fenergo, the leading provider of digital transformation, customer journey and client lifecycle management (CLM) solutions for financial institutions, and IBM (NYSE: IBM) today announced the signing of an original equipment manufacturing (OEM) agreement that will allow the companies to collaborate on solutions that can help clients address the multitude of financial risks they face.
The agreement enables IBM and Fenergo to create solutions that combine Fenergo’s CLM offering with IBM’s RegTech portfolio of anti-money laundering (AML) and know-your-client (KYC) solutions, all built with Watson. As a result, IBM will offer companies a complete AI application suite that is focused on risk and compliance and helps clients fend off financial criminals and meet their intensifying regulatory requirements for disclosure.
IBM plans to build on this work to assist clients in integrating AI-driven insights from its Financial Crimes Insights series of solutions into Fenergo’s CLM solution. Fenergo’s software is designed to help clients further reduce false positives in the AML and KYC solutions, reduce the costs of manual intervention, drive operational efficiencies, and improve overall customer experiences.
2020-04-21 00:00:00 Read the full story…
Weighted Interest Score: 2.8415, Raw Interest Score: 1.6229,
Positive Sentiment: 0.2441, Negative Sentiment 0.3417
Robo-advisers are facing their first major downturn
The expected global recession brought on by the Covid-19 pandemic will prove a stern test for robo-advisers, which have attained a healthy degree of popularity in recent years. Beginner investors using robo-advice platforms are likely to have faced a sudden and severe downturn in their portfolios after several years of consistent, if shallow, growth. “World equity markets had a strong year in 2019 and investors who either started their investment journey or held their nerve to ride out the volatility of markets at the end of 2018, benefited from this performance,” says Neil Alexander, Nutmeg’s new CEO.
The late-2018 volatility may begin to look like a picnic compared to what has already been witnessed in 2020 and what is still likely to come. This was most recently brought home on April 20th, when the Dow fell more than 500 points as the price of US crude slid to a record -$40.32 a barrel, lack of demand making it more costly to store oil than to sell it. Such volatility may prove too hard to stomach for many beginner investors who hold portfolios with robo-advice platforms like Nutmeg.
2020-04-22 08:30:00 Read the full story…
Weighted Interest Score: 2.7595, Raw Interest Score: 1.3591,
Positive Sentiment: 0.2336, Negative Sentiment 0.3822
Software tools for mining COVID-19 research studies go viral among scientists
One month after the debut of the COVID-19 Open Research Dataset, or CORD-19, the database of coronavirus-related research papers has doubled in size – and has given rise to more than a dozen software tools to channel the hundreds of studies that are being published every day about the pandemic.
In a roundup published on the ArXiv preprint server this week, researchers from Seattle’s Allen Institute for Artificial Intelligence, Microsoft Research and other partners in the project say CORD-19’s collection has risen from about 28,000 papers to more than 52,000. Every day, several hundred more papers are being published, in peer-reviewed journals and on preprint servers such as BioRxiv and MedRxiv.
CORD-19 aims to make sense of them all, using the Semantic Scholar academic search engine developed by the Allen Institute for AI, also known as AI2.
“We commit to providing regular updates to the dataset until an end to the crisis is foreseeable,” the project’s organizers say.
2020-04-23 20:11:14+00:00 Read the full story…
Weighted Interest Score: 2.5903, Raw Interest Score: 1.3075,
Positive Sentiment: 0.0849, Negative Sentiment 0.1698
Updating management styles: not just technology
“Evolve or become irrelevant” has been the mantra in the banking and finance sector for some time now. Updating legacy systems and transitioning to more agile, innovative technology has been a challenge at the forefront of most banks’ priorities within recent years.
Developing digital experiences for clients and keeping up with rising customer expectations is essential. Banks must transition to integrated cloud systems and utilise innovative new technologies such as artificial intelligence. However, updating the technology itself isn’t enough; banks must also recognise that moving away from traditional systems is both a technical and a human process.
2020-04-21 08:44:18 Read the full story…
Weighted Interest Score: 2.5144, Raw Interest Score: 1.3632,
Positive Sentiment: 0.3029, Negative Sentiment 0.1212
Cal State LA Introduces COVID-19 Dashboard, AI-Powered Mortality Risk Prediction Tool
COVID-19 is producing a deluge of data, from cases and hospitalizations to ventilator supplies and protein forms. Researchers at Cal State LA are leveraging that data, producing two tools: an interactive visual dashboard showing the predicted progression of COVID-19 in specific areas and an AI model that estimates mortality risk for COVID-19 patients.
The creators of the interactive dashboard, who work in Cal State LA’s College of Business and Economics, were inspired by the COVID-19 dashboard created by Johns Hopkins University early in the pandemic. Seeing room to simplify the dashboard and enable easier comparisons, they used Tableau to create a map that allowed users to filter to specific states and view forecasted cases and deaths. The dashboard, which is updated daily, uses data from the Johns Hopkins Center for Systems Science and Engineering.
2020-04-20 00:00:00 Read the full story…
Weighted Interest Score: 2.5058, Raw Interest Score: 1.1547,
Positive Sentiment: 0.1980, Negative Sentiment 0.0990
SAP Enhances the Support Experience with AI
SAP is making several updates to its Schedule a Manager and Ask an Expert Peer services, among others, to sharpen its focus on the customer support experience and enable customer success.
Building on artificial intelligence (AI) and machine learning technologies, SAP has extended existing functionality with new, automated capabilities such as the Incident Solution Matching service and automatic translation.
“When it comes to customer support, we’ve seen great success in flipping the customer engagement model by leveraging AI and machine learning technologies across our product support functionalities and solutions,” said Andreas Heckmann, head of customer solution support and innovation and executive vice president, SAP. “To simplify and enhance the customer experience through our award-winning support channels, we’re making huge steps towards our goal of meeting customer’s needs by anticipating what they may need before it even occurs.”
2020-04-22 00:00:00 Read the full story…
Weighted Interest Score: 2.4555, Raw Interest Score: 1.6925,
Positive Sentiment: 0.6286, Negative Sentiment 0.1451
Understanding New Data-Driven Methodologies In Software Development
New data-driven methodologies in software development are showing up all the time. Here’s what to know about how to understand them.
The waterfall software development process is a methodology suited to projects whose steps are straightforward and successive. In a waterfall model, developers move in a single direction, completing tasks one after another in a chain. They use machine learning tools and big data platforms to streamline the process as much as possible.
2020-04-22 15:23:47+00:00 Read the full story…
Weighted Interest Score: 2.1703, Raw Interest Score: 1.4993,
Positive Sentiment: 0.1551, Negative Sentiment 0.0905
Nearmap surges as investors embrace cost cuts
Aerial imaging company Nearmap has laid out a series of cost saving measures designed to help the company hit cashflow breakeven by the end of the financial year, despite reporting no material impact from the COVID-19 downturn.
The company intends to maintain its investments in its 3D imaging, artificial intelligence and roof geometry products.
Nearmap’s AI product utilises image recognition technology to analyse images and provide users with details such as how many swimming pools are in a neighbourhood or how many solar panels a suburb has.
2020-04-21 00:00:00 Read the full story…
Weighted Interest Score: 2.1347, Raw Interest Score: 1.0309,
Positive Sentiment: 0.1473, Negative Sentiment 0.2577
Duos Technologies moves full steam ahead with its intelligent technologies
Duos specializes in railcar inspections with its proprietary Railcar Inspection Portal (RIP) technology. It holds 14 patents or patents pending as it builds out its technology and services portfolio, and recently raised $9 million to support growth plans while uplisting to Nasdaq.
Duos also operates an artificial intelligence subsidiary, truevue360, or tv360 for short. The AI-based platform supports Duos’s underlying software platforms for its rail inspection portal system, vehicle undercarriage examiner and advanced logistics information system.
2020-04-24 00:00:00 Read the full story…
Weighted Interest Score: 2.0981, Raw Interest Score: 1.0920,
Positive Sentiment: 0.1324, Negative Sentiment 0.1655
Deep Learning Interview Questions
Looking to crack your next deep learning interview? You’ve come to the right place! We have put together a list of popular deep learning interview questions in this article. Each question comes with a comprehensive answer as well to guide you.
2020-04-20 03:42:32+00:00 Read the full story…
Weighted Interest Score: 2.0863, Raw Interest Score: 1.3583,
Positive Sentiment: 0.1297, Negative Sentiment 0.2352
Replicating Airbnb’s Amenity Detection with Detectron2
Ingredients: 1 x Detectron2, 38,188 x Open Images, 1 x GPU. Model training time: 18-hours. Human time: 127(ish)-hours.
Sometimes text is easier to read than images full of other images.
Collect data with downloadOI.py (a script for downloading selected images from the Open Images dataset). Preprocess data with preprocessing.py (a custom script with functions for turning Open Images images and labels into Detectron2-style data inputs). Model the data with Detectron2.
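The modeling step can be sketched as a Detectron2 training configuration. This is a hedged sketch, not the article's actual setup: the dataset name, class list, and solver settings below are placeholders, and `get_amenity_dicts` stands in for what the article's preprocessing.py would produce.

```python
# Configuration sketch for fine-tuning a Detectron2 detector on a
# custom dataset. All names and settings here are placeholders.
from detectron2 import model_zoo
from detectron2.config import get_cfg
from detectron2.data import DatasetCatalog, MetadataCatalog
from detectron2.engine import DefaultTrainer

def get_amenity_dicts():
    # preprocessing.py would return a list of dicts in Detectron2's
    # standard format (file_name, height, width, annotations, ...).
    return []

DatasetCatalog.register("amenities_train", get_amenity_dicts)
MetadataCatalog.get("amenities_train").thing_classes = ["pool", "fireplace"]  # hypothetical classes

cfg = get_cfg()
cfg.merge_from_file(model_zoo.get_config_file("COCO-Detection/retinanet_R_50_FPN_1x.yaml"))
cfg.DATASETS.TRAIN = ("amenities_train",)
cfg.DATASETS.TEST = ()
cfg.MODEL.WEIGHTS = model_zoo.get_checkpoint_url("COCO-Detection/retinanet_R_50_FPN_1x.yaml")
cfg.MODEL.RETINANET.NUM_CLASSES = 2   # match the number of amenity classes
cfg.SOLVER.MAX_ITER = 1000            # placeholder; the 18-hour run used far more

trainer = DefaultTrainer(cfg)
trainer.resume_or_load(resume=False)
trainer.train()
```

Starting from COCO-pretrained weights, as here, is what makes a 38k-image fine-tune feasible in the quoted 18 hours of GPU time.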
2020-04-27 04:44:13.667000+00:00 Read the full story…
Weighted Interest Score: 2.0662, Raw Interest Score: 0.8230,
Positive Sentiment: 0.1190, Negative Sentiment 0.0545
3 Data-Driven Elements Of Conversion Rate Optimization Strategies
Big data has played a very important role in conversion rate optimization. Smart marketers recognize that they need the latest big data tools to entice customers to make purchases.
Audrey Throne, an author with Big Data Analytics News, has shared some details about the benefits of big data in conversion rate optimization. She stated that there are seven ways it will impact ecommerce models.
2020-04-20 19:16:27+00:00 Read the full story…
Weighted Interest Score: 1.9995, Raw Interest Score: 1.1948,
Positive Sentiment: 0.3170, Negative Sentiment 0.0488
Microsoft technology chief explains how A.I. could someday help rural people get through a pandemic
- Microsoft CTO Kevin Scott wrote a new book with Greg Shaw, “Reprogramming the American Dream,” which talks about the use of AI and other technology to improve the lives of rural Americans.
- He talked to CNBC about how advances in AI could someday help people in small towns get through future pandemics.
- He also discussed universal basic income, saying that instead of paying people whose jobs are disrupted by AI, it would be better for society as a whole to use AI to lower costs on necessary items.
It’s easy to imagine artificial intelligence brightening up life in the big city. Self-driving taxis, drones and food-production machines could provide all sorts of conveniences to city dwellers, like shorter commutes and faster package and food delivery.
But technologists don’t spend as much time talking about how AI can help small towns.
Kevin Scott, Microsoft’s chief technology officer, is an exception. Scott grew up in Gladys, Virginia, a farming community in Campbell County. The county’s population of 54,885 decreased by 252 from the prior year, according to U.S. Census Bureau estimates. He talks about this part of the world in a new book co-written with Greg Shaw, “Reprogramming the American Dream.”
2020-04-26 00:00:00 Read the full story…
Weighted Interest Score: 1.9874, Raw Interest Score: 0.9040,
Positive Sentiment: 0.2712, Negative Sentiment 0.2260
In Pursuit of Citizen Data Scientists, Not Unicorns
As the CIO of a $26-billion manufacturer, Gary Cantrell had the will and the means to hire data scientists. He had plenty of data science problems to tackle at Jabil, which manufactures electronic devices on behalf of 300 clients at more than 100 facilities around the world. The problem was, there were no data scientists to be found.
“For the longest period, I was convinced there were only three data scientists in the world, and they just moved around from company to company, getting more and more money, because you couldn’t find these folks,” says Cantrell, who is also the senior vice president of IT at Jabil. “That’s what kicked us off on this program.”
For the past three years, Jabil (pronounced “JAY-bill”) has run almost 200 employees through a four-month course. The Jabil employees enter the course as engineers, analysts, or other business-oriented experts, and they exit as citizen data scientists, ready to tackle data science challenges for the 200,000-person firm.
2020-04-20 00:00:00 Read the full story…
Weighted Interest Score: 1.9661, Raw Interest Score: 1.1417,
Positive Sentiment: 0.1953, Negative Sentiment 0.2103
Key Data Trends And Forecasts In The Energy Sector
There are important data trends and forecasts in the energy sector that are well worth noting. Here’s what to know about them.
With the Coronavirus pandemic, the world has been thrown into complete uncertainty. This goes for nearly everyone, but the energy sector, from renewables to coal, is being especially hard hit by social distancing and the current situation around quarantine. A new study, Global Big Data Analytics in the Energy Sector Market, provides a comprehensive look at the industry. Large quantities of information are gathered from various sources within an organization, and the value of that data has become a primary focus for energy companies.
2020-04-20 19:11:10+00:00 Read the full story…
Weighted Interest Score: 1.9589, Raw Interest Score: 1.1366,
Positive Sentiment: 0.3386, Negative Sentiment 0.2418
This news clip post is produced algorithmically based upon CloudQuant’s list of sites and focus items we find interesting. We used natural language processing (NLP) to determine an interest score, and to calculate the sentiment of the linked article using the Loughran and McDonald Sentiment Word Lists.
If you would like to add your blog or website to our search crawler, please email firstname.lastname@example.org. We welcome all contributors.
This news clip and any CloudQuant comment is for information and illustrative purposes only. It is not, and should not be regarded as investment advice or as a recommendation regarding a course of action. This information is provided with the understanding that CloudQuant is not acting in a fiduciary or advisory capacity under any contract with you, or any applicable law or regulation. You are responsible to make your own independent decision with respect to any course of action based on the content of this post.