Title
Predlog arhitekture sistema visokih performansi za generalnu obradu podataka na klasterima za podatke velikog obima
Creator
Štufi, Martin T., 1973-
CONOR:
103887881
Copyright date
2022
Select license
Attribution-NonCommercial-NoDerivs 3.0 Serbia (CC BY-NC-ND 3.0)
License description
You permit only downloading and distribution of the work, provided the author's name is properly credited, without any changes to the work and without the right to use the work commercially. This is the most restrictive CC license. Basic description of the license: http://creativecommons.org/licenses/by-nc-nd/3.0/rs/deed.sr_LATN. Full text of the license: http://creativecommons.org/licenses/by-nc-nd/3.0/rs/legalcode.sr-Latn
Language
Serbian
Theses Type
Doctoral dissertation
description
Defense date: 10.07.2023.
Other responsibilities
committee member
Stojanović, Dragan
committee member
Rančić, Dejan
committee member
Stanimirović, Aleksandar
committee member
Milosavljević, Branko
Academic Expertise
Natural sciences and mathematics
Academic Title
-
University
University of Niš
Faculty
Faculty of Electronic Engineering
Group
Department of Computer Science
Alternative title
An architecture proposal for high-performance and general data processing system on big data clusters
Publisher
[M. T. Štufi]
Format
122 leaves
description
Bibliography: leaves 95-100.
description
Distributed (cluster) data processing systems
Abstract (en)
In recent years, the widespread adoption of Big Data,
Internet of Things (IoT), and Cloud technologies has
increased the use of large-scale data processing systems.
The volume of heterogeneous data these technologies generate
(structured, unstructured, and semi-structured) has grown
exponentially. Processing and analyzing such a tremendous
amount of data is cumbersome, and it is gradually moving
from classic "batch" processing with extraction,
transformation, and loading (ETL) techniques to real-time
processing, for example in the automobile industry and
healthcare, but also in other disciplines. Tracking, data
processing, environmental management, time-series data, and
historical data sets are crucial to forecasting models, and
not only in these domains.
This doctoral dissertation presents the design of a
general architecture for processing large amounts of data.
The architecture enables efficient acquisition of data, its
optimal placement, processing of large amounts of data, and
the use of various algorithms for drawing conclusions as
well as for displaying data. The dissertation shows the
complete process of modeling and designing the architecture
and the selection of appropriate software components for its
realization. The presented platform met very demanding
system performance requirements, including the decision
support benchmark of the Transaction Processing Performance
Council (TPC-H), while complying with the legislation of the
European Union (EU) and the Czech Republic. The presented
proof of concept (PoC), since upgraded to a production
environment, has united isolated parts of the Czech
Republic's healthcare system. The reported PoC Big Data
Analytics platform, artefacts, and concepts can be
transferred to health systems in other countries interested
in developing or upgrading their national health
infrastructure in a cost-effective, secure, scalable, and
high-performance way.
Authors Key words
Cluster, Big Data, Stream, Vertica, NoSQL, real-time
data processing, stream data
Authors Key words
Big Data, Big Data Analytics, TPC-H, NoSQL Database
cluster, Real time BDA
Classification
004.6/.65:004.275(043.3)
Subject
T 120
Type
Text