BIG DATA PROCESSING USING HADOOP HDFS AND MAP-REDUCE FOR PUBLIC OPEN DATA (POD)

Najhan M. Ibrahim, Norbik B. Idris, Modh K.A. Hassan, Ciara Breathnach, Amir A.A. Hussin

Research output: Contribution to journal › Article › peer-review

Abstract

Information acquired and processed by government agencies and made available to the public via a web portal is known as Public Open Data (POD). Many governments have established POD initiatives to promote transparency by making data publicly accessible, and Malaysia is no exception, having opened public-access data since 2014. Processing such data becomes increasingly complex as the number of data sources and the variety, volume, velocity, and complexity of the data grow. It is therefore important to consolidate POD and to find new methods for handling and optimizing it at scale. With the growth of Big Data, the resulting strain on conventional computer systems has raised a number of problems, and industry, researchers, and government have consequently entered into several comprehensive research and development cooperation agreements. While the large volume of data and the variety of data formats make Big Data processing challenging, they also have the potential to produce innovative solutions with a wider range of applications. There is also ongoing debate over whether Big Data can supplant traditional data records. This study investigates the potential of Big Data processing using Apache Hadoop HDFS and Java MapReduce to discover valuable patterns, correlations, trend preferences, and other important information. With improved prediction and prevention mechanisms in place, authorities could reduce the overall cost of delivering public services.
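
As an illustration of the kind of processing the abstract refers to, the following is a minimal sketch of a Hadoop MapReduce job in Java that counts POD records per category read from a CSV dataset stored in HDFS. It is not the authors' implementation; the class names, the assumed input layout (a comma-separated file whose second column holds a category), and the input/output paths are hypothetical and shown only to make the map/reduce flow concrete:

import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

// Hypothetical example: count POD records per category in a CSV stored in HDFS.
public class PodCategoryCount {

    public static class CategoryMapper
            extends Mapper<LongWritable, Text, Text, IntWritable> {
        private static final IntWritable ONE = new IntWritable(1);
        private final Text category = new Text();

        @Override
        protected void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            // Assumed record layout (hypothetical): id,category,agency,...
            String[] fields = value.toString().split(",");
            if (fields.length > 1) {
                category.set(fields[1].trim());
                context.write(category, ONE); // emit (category, 1) per record
            }
        }
    }

    public static class SumReducer
            extends Reducer<Text, IntWritable, Text, IntWritable> {
        @Override
        protected void reduce(Text key, Iterable<IntWritable> values, Context context)
                throws IOException, InterruptedException {
            int sum = 0;
            for (IntWritable v : values) {
                sum += v.get(); // total records seen for this category
            }
            context.write(key, new IntWritable(sum));
        }
    }

    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        Job job = Job.getInstance(conf, "pod category count");
        job.setJarByClass(PodCategoryCount.class);
        job.setMapperClass(CategoryMapper.class);
        job.setCombinerClass(SumReducer.class); // local pre-aggregation on each mapper
        job.setReducerClass(SumReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));   // HDFS input directory
        FileOutputFormat.setOutputPath(job, new Path(args[1])); // HDFS output directory
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}

Such a job would typically be packaged as a JAR and submitted with a command along the lines of "hadoop jar pod-count.jar PodCategoryCount /pod/input /pod/output" (paths assumed), with HDFS distributing the input splits across the cluster's mapper tasks.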

Original language: English
Pages (from-to): 1-11
Number of pages: 11
Journal: Journal of Engineering Science and Technology
Volume: 16
Publication status: Published - 2021

Keywords

  • Apache Hadoop
  • Big data processing
  • HDFS
  • Map-reduce
  • Public open data
