Machine Learning 2019: Improved and Intelligent Hybrid Machine Learning Models For Fast Big Data Processing - Andronicus Ayobami Akinyelu - University of the Free State - South Africa

The volume of data generated daily through the Google search engine, Twitter, Instagram, Facebook, and similar services is becoming overwhelming. Traditional data analytics techniques are fast losing their capability and efficiency in handling such high volumes. This problem has motivated several researchers to design efficient and fast techniques for big data processing. Some of these techniques reduce the volume of the input dataset to shorten the computation time of big dataset processing. Fast and accurate big dataset processing techniques can be developed using Machine Learning (ML) algorithms and nature-inspired instance selection techniques. This paper presents a hybrid nature-inspired, ML-based method for improving the computation speed of big dataset processing. The proposed method combines an intelligent instance selection algorithm (inspired by edge selection in the Ant Colony Optimization algorithm) with four ML algorithms: Naïve Bayes, random forest, BayesNet, and artificial neural network. The method is evaluated on large- and medium-scale datasets, and the results reveal that it notably improves the computation speed of big data classification without significantly affecting predictive accuracy. The results also show that the proposed technique improves data reduction capacity and compares well with existing data reduction techniques.
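The abstract does not spell out the instance selection algorithm, so the following is only a minimal sketch of the general idea, not the author's method. It assumes an ACO-like edge rule in which an ant standing on one instance moves to an instance of a different class with probability proportional to pheromone and inverse distance, so that instances near the class boundary collect the most pheromone and are kept. The function name, parameters, and toy data are all hypothetical.

```python
import math
import random


def aco_style_instance_selection(X, y, keep, n_ants=20, n_steps=30,
                                 alpha=1.0, beta=2.0, evaporation=0.1, seed=0):
    """Return sorted indices of `keep` instances favoured by an ACO-like rule.

    Hypothetical sketch: an ant at instance i picks an instance j of another
    class with probability proportional to
    pheromone[j]**alpha * (1 / distance(i, j))**beta.
    """
    rng = random.Random(seed)
    n = len(X)
    pheromone = [1.0] * n

    for _ in range(n_steps):
        deposits = [0.0] * n
        for _ in range(n_ants):
            i = rng.randrange(n)  # each ant starts at a random instance
            candidates = [j for j in range(n) if y[j] != y[i]]
            if not candidates:
                continue  # single-class data: nothing to select
            weights = [
                pheromone[j] ** alpha
                * (1.0 / (math.dist(X[i], X[j]) + 1e-9)) ** beta
                for j in candidates
            ]
            j = rng.choices(candidates, weights=weights, k=1)[0]
            deposits[j] += 1.0  # reinforce the chosen edge's endpoint
        # Evaporate old pheromone, then add this step's deposits.
        pheromone = [(1.0 - evaporation) * p + d
                     for p, d in zip(pheromone, deposits)]

    ranked = sorted(range(n), key=lambda idx: pheromone[idx], reverse=True)
    return sorted(ranked[:keep])


# Toy data (made up): two 2-D classes that meet near (1, 1).
X = [(0.0, 0.0), (0.2, 0.1), (0.9, 1.0), (1.0, 0.9),
     (1.2, 1.1), (1.1, 1.3), (2.0, 2.0), (2.2, 2.1)]
y = [0, 0, 0, 0, 1, 1, 1, 1]

selected = aco_style_instance_selection(X, y, keep=4)
X_small = [X[i] for i in selected]  # reduced training set
y_small = [y[i] for i in selected]
```

The reduced pair `X_small`, `y_small` would then be passed to any of the classifiers named in the abstract in place of the full dataset; the speed-up comes from training on fewer instances.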

Rapid developments in hardware, software, and communication technologies have facilitated the emergence of Internet-connected sensory devices that provide observations and data measurements from the physical world. By 2020, it is estimated that the total number of Internet-connected devices in use will be between 25 and 50 billion. As these numbers grow and the technologies become more mature, the volume of data being published will increase. The technology of Internet-connected devices, referred to as the Internet of Things (IoT), continues to extend the current Internet by providing connectivity and interactions between the physical and cyber worlds. In addition to increased volume, the IoT generates big data characterized by its velocity in terms of time and location dependency, with a variety of multiple modalities and varying data quality. Intelligent processing and analysis of this big data are the key to developing smart IoT applications. This article assesses the different machine learning methods that deal with the challenges presented by IoT data, considering smart cities as the main use case. The key contribution of this study is the presentation of a taxonomy of machine learning algorithms explaining how different techniques are applied to the data in order to extract higher-level information. The potential and challenges of machine learning for IoT data analytics are also discussed. A use case of applying a Support Vector Machine (SVM) to Aarhus smart city traffic data is presented for a more detailed exploration. Emerging technologies in recent years and major enhancements to Internet protocols and computing systems have made communication between different devices easier than ever before.
According to various forecasts, around 25–50 billion devices are expected to be connected to the Internet by 2020. This has given rise to the newly developed concept of the Internet of Things (IoT). IoT is a combination of embedded technologies including wired and wireless communications, sensor and actuator devices, and the physical objects connected to the Internet. One of the long-standing objectives of computing is to simplify and enrich human activities and experiences (e.g., see the visions associated with "The Computer for the 21st Century" or "Computing for Human Experience"). IoT needs data either to provide better services to users or to enhance the performance of the IoT framework so that it can accomplish this intelligently. In this manner, systems should be able to access raw data from different resources over the network and analyze this information in order to extract knowledge. Since IoT will be among the most significant sources of new data, data science will make a considerable contribution to making IoT applications more intelligent. Data science is the combination of different scientific fields that uses data mining, machine learning, and other techniques to find patterns and new insights from data. These techniques include a broad range of algorithms applicable in different domains. The process of applying data analytics methods to particular areas involves defining data types such as volume, variety, and velocity; data models such as neural networks, classification, and clustering methods; and applying efficient algorithms that match the data characteristics. From our review, the following is deduced: First, since data is generated from different sources with specific data types, it is important to adopt or develop algorithms that can handle the data characteristics.
Second, the great number of resources that generate data in real time are not without the problem of scale and velocity. Finally, finding the best data model that fits the data is one of the most important issues for pattern recognition and for better analysis of IoT data. These issues have opened a vast number of opportunities for new developments. Big data is defined as high-volume, high-velocity, and high-variety data that demands cost-effective, innovative forms of information processing that enable enhanced insight, decision making, and process automation. With respect to the challenges posed by big data, it is necessary to introduce a new concept termed smart data, which means: "realizing productivity, efficiency, and effectiveness gains by using semantics to transform raw data into Smart Data". A more recent definition of this concept is: "Smart Data provides value from harnessing the challenges posed by volume, velocity, variety, and veracity of Big Data, and in turn providing actionable information and improving decision making." Finally, smart data can act as a good representative for IoT data.
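As a rough illustration of the kind of SVM use case mentioned above for smart city traffic data, the sketch below trains a linear SVM by sub-gradient descent on the regularized hinge loss. It does not use the Aarhus dataset: the two features (normalized vehicle count and normalized average speed), the labels, the readings, and the function names are all assumptions made up for this example, and the solver is a teaching sketch rather than a tuned implementation.

```python
import random


def train_linear_svm(X, y, lr=0.1, lam=0.01, epochs=100, seed=0):
    """Fit w, b by sub-gradient descent on the L2-regularized hinge loss.

    Labels must be +1 or -1. Hypothetical teaching sketch, not a tuned solver.
    """
    rng = random.Random(seed)
    w = [0.0] * len(X[0])
    b = 0.0
    order = list(range(len(X)))
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            score = sum(wj * xj for wj, xj in zip(w, X[i])) + b
            # The regularization term shrinks the weights on every step.
            w = [(1.0 - lr * lam) * wj for wj in w]
            # The hinge term contributes only when the margin is violated.
            if y[i] * score < 1.0:
                w = [wj + lr * y[i] * xj for wj, xj in zip(w, X[i])]
                b += lr * y[i]
    return w, b


def predict(w, b, x):
    """Return +1 (congested) or -1 (free-flowing) for one reading."""
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) + b >= 0 else -1


# Synthetic readings: (normalized vehicle count, normalized average speed).
# +1 = congested, -1 = free-flowing. Values are invented for illustration.
X = [(0.9, 0.1), (0.8, 0.2), (0.7, 0.3), (0.85, 0.15),
     (0.1, 0.9), (0.2, 0.8), (0.3, 0.7), (0.15, 0.85)]
y = [1, 1, 1, 1, -1, -1, -1, -1]

w, b = train_linear_svm(X, y)
labels = [predict(w, b, x) for x in X]
```

A linear kernel is used here only to keep the sketch short and dependency-free; real traffic data would likely need feature scaling, a held-out test set, and possibly a non-linear kernel.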

Clearly, we are living in an era of data deluge, evidenced by the phenomenon that enormous amounts of data are being continually generated at unprecedented and ever-increasing scales. Large-scale data sets are collected and studied in numerous domains, from engineering sciences to social networks, commerce, biomolecular research, and security. In particular, digital data, generated from a variety of digital devices, are growing at astonishing rates. As of 2011, digital information had reportedly grown multiple times in volume in just five years, and its amount worldwide was expected to reach 35 trillion gigabytes by 2020. Accordingly, the term "Big Data" was coined to capture the profound meaning of this data explosion trend. Among existing surveys, a comprehensive overview of big data from three different angles, i.e., innovation, competition, and productivity, was presented by the McKinsey Global Institute (MGI). Besides describing the fundamental techniques and technologies of big data, a number of more recent studies have investigated big data in particular contexts. For example, the features of big data from the Internet of Things (IoT) have been briefly reviewed. Some authors have also analyzed the new characteristics of big data in wireless networks, e.g., in terms of 5G.


Author(s): Andronicus Ayobami Akinyelu 
