Business Data Extraction Using a Programming Language

Abstract

In an era of massive information volumes, technologies that assist in the extraction, transformation, and loading of data have become increasingly necessary. The term Big Data, commonly used to describe this volume of information, requires users to master multiple tools such as Excel, VBA, SQL, Tableau, Python, Spark, and AWS. In this context, the present work studies data extraction techniques using different methodologies. As a result, a library of functions in the Python language is made available that compiles stock price information from the Yahoo Finance website as well as balance sheets of financial institutions released by Bacen (the Brazilian Central Bank). The main resource used is Web Scraping, a method that automates data collection via the web. Once the collection of functions has been structured, it is made publicly available through the GitHub platform.

Share and Cite:

Carmo, F., Magnani, V., Gatsios, R., & Lima, F. (2022). Business Data Extraction Using a Programming Language. Theoretical Economics Letters, 12, 195-215. doi: 10.4236/tel.2022.121011.

1. Introduction

In recent years, the impact of technological advances on the financial market has become an essential phenomenon for understanding the new behaviors and socioeconomic dynamics that accompany the business world. In this context, companies process a considerable volume of information, which requires new data management tools to capture, store, and analyze it (Laudon & Laudon, 2014).

The large set of stored data has been entitled “Big Data” by academics and market professionals. According to Chen, Mao, and Liu (2014), Big Data refers to a set of data that cannot be read, managed, and processed by traditional IT tools and software/hardware within a tolerable time.

There is a consensus in the literature that Big Data is defined by five "Vs": velocity, volume, variety, veracity, and value (Naeem et al., 2022; Ferrara et al., 2014; Laender et al., 2002). Velocity, because companies need information to flow quickly, as close to real time as possible. Volume, because of the large amount of data generated that needs to be stored. Variety, due to the gathering of information from different sources. Veracity, related to the quality of the information, since it needs to be reliable. And finally, value, since the data support the decision-making process of businesses, leading to the creation of new opportunities for companies (Chen, Mao, & Liu, 2014; Laudon & Laudon, 2014).

As a result of this technological expansion and the amount of information to be interpreted, programming languages, database managers, and cloud computing services have emerged as a foundation for processing the elements that drive decision making. One of the most important computational languages in the study of data is Python, created in 1991 by the mathematician Guido van Rossum. The language is object-oriented, interpreted, and designed to be easy to learn, and it offers several convenient modules that increase developers' productivity (Buriol, Marco, & Argenta, 2009).

This programming language includes several high-level structures such as lists, dictionaries, date/time types, complex numbers, and others, in addition to a vast collection of ready-to-use modules and third-party frameworks that can be added. These characteristics provide sufficient resources to make it one of the main components for analysis and decision making in data work (Borges, 2014).

The Database Management System (DBMS), in turn, is responsible for the interface for manipulating and organizing the collected information. Common examples include MySQL, SQL Server, and Oracle. To use them, it is necessary to know Structured Query Language (SQL). According to Miyagusku (2008), the SQL language involves not only data query commands but also data manipulation and the definition of rules and operators that preserve the integrity and consistency of the data, in addition to allowing the implementation of stored procedures, functions, and triggers, as well as integration with applications developed in other languages. DBMSs are excellent for processing and structuring information, but when it comes to massive amounts of data, activities need to be migrated to cloud computing; AWS (Amazon Web Services) is one example of a provider of this type of service.

The term Big Data, commonly used to describe a vast mass of processed information, has also supported the emergence of new processing tools. Spark, for example, was built with large-scale use in mind and allows the user, within the same application, to work with different languages and interfaces such as SQL. Because of these factors, the number of professionals using Python, DBMSs, and cloud computing services has grown, for example to analyze the earnings of companies listed on the São Paulo Stock Exchange (B3).

Investors use many strategies to study the performance of a company, but the essential first step before starting any assessment is to build a database with all of the organization's information. Thus, the main objective of the present work is to structure the use of these new technologies for obtaining bulk data. Three distinct methods will be used to demonstrate the daily routine of a data scientist in the job market: the first is the search for stock market trading prices and the analysis of the data; the second is the retrieval of ".zip" files extracted directly from the HyperText Markup Language (HTML) code of the site's server page; and the third is the creation of "Comma-Separated Values" (CSV) files with the objective of manipulating and organizing the data for further analysis. The codes are applied to Brazilian data sources; however, the method and coding can be adapted to data sources in other countries. Therefore, the novelty of this work is to highlight and disseminate routines for collecting and building a database applied to internet data in HTML, ".zip", and CSV formats.

2. Research Methodology

The work has a descriptive approach, with the main objective of documenting the methodology used by a data scientist, from the creation of an account on a cloud storage platform to the construction and application of the code for data interpretation.

The objects of study are the historical prices of financial institutions traded on B3, as well as their income statements, for the periods of 2017, 2018, and 2019. The price information will be extracted directly from the Application Programming Interface (API) provided by the Yahoo Finance website, and the income statements will be collected directly from the Brazilian Central Bank website. To structure the information source, Amazon S3 (Simple Storage Service) will be used to store the data files, which will be organized in CSV format. The programming language used will be Python in the Jupyter development environment, since its function libraries facilitate the extraction of the information to be collected.

Among the libraries available in the Jupyter development environment, the following will be used:

Pandas_datareader: This includes functions that allow the user to extract historical stock price information from Yahoo Finance, one of the largest financial news sites in the world. A plausible alternative for obtaining the price data of traded stocks is the Quandl library, a leading source of financial, economic, and alternative data sets for investment professionals.

NumPy: This is mainly used to perform calculations on multidimensional arrays. In addition, it provides a large set of functions and operations that help programmers easily perform numerical calculations.

Pandas: This is the main tool for organizing data. It provides the simplicity of being able to clean, delete, or filter information according to a criterion as well as store tables and graphs in different file formats.

Requests: This allows HTTP (Hypertext Transfer Protocol) requests to be made to the specified server of the webpage. It is the bridge for performing the extraction of data.

Zipfile: This provides tools to create, read, write, add, and list files in the “.zip” format.

Io: This provides tools for handling in-memory streams; here, it is used to treat content downloaded from the web as a file-like object.

In the development of the methodology, all of the cited libraries will be used, with the intention of covering as many of the resources offered by Python as possible and of providing the reader with enough material to learn and apply the knowledge on his/her own.
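For reference, a typical import cell using these libraries might look like the following minimal sketch; the aliases are common conventions rather than requirements, and the exact imports depend on the functions being built.

import io                                  # in-memory byte streams
import zipfile                             # reading ".zip" archives
import requests                            # HTTP requests to web servers
import numpy as np                         # numerical calculations on arrays
import pandas as pd                        # tabular data manipulation
import pandas_datareader.data as web       # historical prices from Yahoo Finance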

3. Development of Working Model

3.1. Creating an AWS Account

The AWS platform is the world's most widely adopted and comprehensive cloud platform, offering more than 175 data center services worldwide. There are different uses for AWS. One of its main services is EMR, which facilitates the execution of Big Data frameworks such as Hadoop and Apache Spark. In the third data extraction method, only the S3 service will be used, to store CSV files in the cloud. Therefore, to execute the codes, it is necessary to create an account on AWS.

Figure 1 shows the AWS platform account creation screen.

Figure 1. AWS account creation screen. Source: Own authorship.

3.2. Jupyter and Python Installation

Jupyter Notebook is an Integrated Development Environment (IDE), that is, the environment in which a programmer develops applications. It is also one of the most widely used tools for creating Python files because it allows users to combine code and text in a single document. The installer provided by the Anaconda project on its official website (Anaconda, 2020) should be used to install the software. Figure 2 shows the steps to follow after downloading the file:

Figure 2. Steps to download file. Source: Own authorship.

3.3. Creating the First Python File

After installation, Jupyter Notebook should be started. A local server will run (by default, on local port 8888), and, once loaded, a new tab will open in the browser, making it possible to see the Dashboard, Figure 3.

A list of files and subdirectories present on the computer is displayed. The user should create or select a folder to receive the new files. Afterward, the user should select the "New" option and choose "Python 3", Figure 4.

A new screen will open. This is where all the applications will be developed, Figure 5.

Figure 5 shows that the Jupyter Notebook is composed of the following elements:

1) File name: name of the file, which will also be displayed at the top of the screen.

Figure 3. Jupyter dashboard. Source: Own authorship.

Figure 4. Jupyter directories. Source: Own authorship.

Figure 5. Application development screen. Source: Own authorship.

2) Menu bar: different options that can be used to manipulate the operation of the notebook.

3) Toolbar: quick way to perform the most frequently used operations.

4) Code cell: allows the user to edit or write new code.

3.4. Installing the Libraries

A library is a collection of modules accessible in a Python program that simplifies the programming process and removes the need to rewrite an already written function, such as one that calculates the square root of a number; that function can be called repeatedly, simply changing its parameters as needed. Because of this, it is necessary to install open source libraries (which anyone has the right to study, modify, or distribute) that will help to create the data search methods. The execution is as follows, Figure 6.
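Since the contents of Figure 6 are not reproduced here, the following is a minimal sketch of the installation step, assuming a Jupyter notebook cell; zipfile and io are part of Python's standard library and need no installation.

# Run once in a notebook cell to install the third-party libraries used in this work.
!pip install pandas-datareader pandas numpy requests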

3.5. Methods

3.5.1. Yahoo Finance

The first method uses the "pandas_datareader" library to search for the trading prices of shares on stock exchanges. The consulted data are extracted from the Yahoo Finance website and have a fifteen-minute lag. The function created for the library, called "dados_acoes_yahoo", was defined based on four parameters:

1st: Tickers, in which it is necessary to define the stock codes to be searched. By convention, all assets traded on B3 have the ending “.SA” (Joint Stock Company), as in “MGLU3.SA”, “AZUL4.SA”, “HBOR3.SA”, and so on.

2nd: Start date, which will serve to reference the beginning of the consultation period. Written in a “year-month-day” pattern (“2020-01-01”).

3rd: End date, which will serve to reference the end of the consultation period. Written in a “year-month-day” pattern (“2020-06-01”).

4th: Type, which will reference the desired information among the possible types: "G"—All available information; "A"—Adjusted closing prices; "C"—Closing prices; "H"—Maximum prices; "L"—Minimum prices; "O"—Opening prices; "V"—Trading volume. Figure 7 shows the handling of the "dados_acoes_yahoo" function.

Figure 6. Installing the libraries. Source: Own authorship.

Figure 7. Handling of the “data_actions_yahoo” function. Source: Own authorship.
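Because the figure contents are not reproduced here, the following is a minimal sketch of what such a function might look like, assuming the pandas_datareader Yahoo interface and the parameter names described above; the authors' actual implementation is available in the GitHub repository.

import pandas_datareader.data as web

def dados_acoes_yahoo(tickers, start_date, end_date, tipo="G"):
    # Map the single-letter type codes described in the text to Yahoo Finance columns.
    colunas = {"A": "Adj Close", "C": "Close", "H": "High",
               "L": "Low", "O": "Open", "V": "Volume"}
    dados = web.DataReader(tickers, "yahoo", start_date, end_date)
    if tipo == "G":                      # "G" returns all available information
        return dados
    return dados[colunas[tipo]]          # otherwise, keep only the requested field

# Example: adjusted closing prices for two B3 stocks
# precos = dados_acoes_yahoo(["MGLU3.SA", "AZUL4.SA"], "2020-01-01", "2020-06-01", "A")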

The second function, "plot_graphs_closing", aims to create a comparative chart of the evolution of stock prices based on the parameters "tickers", "start_date", and "end_date". Figure 8 demonstrates the graph-plotting function.
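A sketch of how such a plotting routine could be written, reusing the price-fetching sketch above; matplotlib is assumed as the charting backend, and the function name (which varies slightly across the text) is illustrative.

import matplotlib.pyplot as plt

def plot_graphs_closing(tickers, start_date, end_date):
    # Fetch closing prices and plot one line per ticker for visual comparison.
    fechamentos = dados_acoes_yahoo(tickers, start_date, end_date, "C")
    fechamentos.plot(figsize=(10, 5), title="Closing prices")
    plt.xlabel("Date")
    plt.ylabel("Price")
    plt.show()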

The third function, "simple_returns", creates a table with the simple stock returns in the specified period, calculated according to Equation (1) and evidenced in Figure 9.

\left( \frac{\text{Share Price in } T_n}{\text{Share Price in } T_{n-1}} \right) - 1 \qquad (1)

The parameters are “tickers”, “start_date”, and “end_date”, Figure 9.

The last function, "retornos_log", has the objective of calculating the logarithmic returns of the stocks in the specified period, using the calculation in Equation (2):

\ln(\text{Share Price}_{t}) - \ln(\text{Share Price}_{t-1}) \qquad (2)

The parameters are “tickers”, “start_date”, and “end_date”, Figure 10.
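A sketch of the two return calculations, Equations (1) and (2), again reusing the price-fetching sketch above; the use of adjusted closing prices is an assumption.

import numpy as np

def simple_returns(tickers, start_date, end_date):
    precos = dados_acoes_yahoo(tickers, start_date, end_date, "A")
    return precos / precos.shift(1) - 1                     # Equation (1)

def retornos_log(tickers, start_date, end_date):
    precos = dados_acoes_yahoo(tickers, start_date, end_date, "A")
    return np.log(precos) - np.log(precos.shift(1))         # Equation (2)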

Finally, many other functions can be created based on each user’s need. However, the main goal of programming is to avoid repeating work and make it easier to write code.

Figure 8. Handling of the “plot_graphs_closure” function. Source: Own authorship.

Figure 9. Treatment of the “simple_returns” function. Source: Own authorship.

Figure 10. Treatment of the function “retornos_log.” Source: Own authorship.

3.5.2. ZIP Files

The method based on ".zip" files is one of the most frequently used in data collection. The purpose is to obtain a site's information without manually downloading and saving any files: the information is extracted from the HTML code of the page itself or from the HTTP connection with the site's server. The following example consolidates information from the balance sheets of financial institutions published by Bacen.

First of all, it is necessary to check the repository format, which is a ".zip" archive containing CSV files. In addition, when the file is opened in Excel, the following structure can be observed, Figure 11.

As shown above, the data starts on the fourth line. This information will be important in view of the need to structure the data when writing the function. The code is as follows, Figure 12.

Analyzing the code line by line, we have the following steps (a condensed sketch of the complete function is provided after the list):

• Line 1: Definition of the function, which accepts two parameters, "start_harvest" and "end_harvest" (the initial and final reference periods), both in the "year-month" format, for example, "202001".

• Line 3: Definition of the calculation that finds the number of months between the initial and final periods.

• Line 4: Creation of a list that will contain the periods to be searched.

• Line 5: Definition of a variable “i” with a value of zero that will serve as an auxiliary.

• Lines 7 to 17: A loop that adds the periods chosen by the user to the list.

• Line 19: Creation of a list that will contain the data from the CSV files to be extracted.

Figure 11. Excel structure. Source: Own authorship.

Figure 12. Handling the “fetch_balancets” function. Source: Own authorship.

• Lines 21 to 27: The loop to capture the file information is built.

• Line 22: Definition of the URL of the site where we will collect the information from the balance sheets.

• Line 23: Use of the method “requests.get” to make the connection with the site.

• Line 24: Use of a chain of methods. Analyzing from the inner layer outward, we have "r.content", which collects all the bytes returned by the connection made in line 23; "io.BytesIO", which wraps those bytes as an in-memory file; and, lastly, "zipfile.ZipFile", which opens the ".zip" archive.

• Line 25: A variable is set to capture the unzipped “.zip” file.

• Line 26: A “DataFrame” is created and the method “pd.read_csv” is used to read the CSV file contained in the decompressed archive. It is important to observe the parameters used in the function.

• “sep = ‘;’”: CSV file with separator “;”.

• “skiprows = 3”: Skips three rows as indicated in the Excel figure above to start reading the information from the row that has the data pertinent to the trial balances.

• "engine = 'python'": An optional parameter that defines the parsing engine. The C engine is faster, while the Python engine is currently more feature-complete.

• Line 27: Addition of the extracted data to the list of data created.

• Line 29: Concatenation of all extracted data into a single table.

• Line 30: Return of the extracted data.
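The following is a condensed sketch of the routine walked through above. The Bacen download URL pattern and the CSV encoding are assumptions, and the period list is built with pandas instead of the manual loop of lines 3 to 17; the authors' full code is in the GitHub repository.

import io
import zipfile
import requests
import pandas as pd

def fetch_balancets(start_harvest, end_harvest):
    # start_harvest / end_harvest in "YYYYMM" format, e.g. "202001".
    inicio = f"{start_harvest[:4]}-{start_harvest[4:]}"
    fim = f"{end_harvest[:4]}-{end_harvest[4:]}"
    periodos = pd.period_range(inicio, fim, freq="M").strftime("%Y%m")   # list of periods to fetch
    dados = []
    for periodo in periodos:                                             # download loop (lines 21 to 27)
        url = f"https://www4.bcb.gov.br/fis/cosif/cont/balan/nicbal/{periodo}NICBAL.ZIP"  # assumed URL pattern
        r = requests.get(url)
        z = zipfile.ZipFile(io.BytesIO(r.content))        # read the ".zip" directly from memory
        nome = z.namelist()[0]                            # the archive contains a single CSV
        df = pd.read_csv(z.open(nome), sep=";", skiprows=3,
                         engine="python", encoding="latin-1")   # data start on the fourth line; encoding assumed
        dados.append(df)
    return pd.concat(dados, ignore_index=True)            # consolidate and return (lines 29 and 30)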

3.5.3. CSV in AWS S3

The CSV file method in AWS S3 is an alternative way to manipulate and organize data. As an example, we will use the financial statement data provided by the Central Bank on the IFDATA website (https://www3.bcb.gov.br/ifdata/). First, the user must download the desired files. For the institution type, the user must select the option "Financial Conglomerates and Independent Institutions", in addition to choosing one of the three possible reports (assets, liabilities, or income statement), Figure 13.

Once the files have been obtained, the user logs in to the AWS website and selects the “AWS Management Console” option to log into the account, Figure 14.

Upon logging in, the user must click on services in the top menu and select “S3” in the storage group, Figure 15.

This is the screen where the user can create several directories (“Buckets”) that must have unique names. To do so, press “Create Bucket” and configure it the way you prefer, Figure 16.

Figure 13. IFDATA data selection screen. Source: Own authorship.

Figure 14. AWS login screen. Source: Own authorship.

Figure 15. AWS selection menu. Source: Own authorship.

Figure 16. Creating the bucket. Source: Own authorship.

After creating the Bucket, creating a folder for each report in order to maintain the organization of the files is suggested, Figure 17.

The user must upload the files that he/she downloaded in the created folder, and it is essential that, for the functioning of the library, the file is named with the standard “year-quarter”, for example, “202001” referring to the first quarter of 2020, Figure 18.

Finally, the user must select all the files that were uploaded to the folder and in the “Actions” option, select “Make Public”, Figure 19.

After finishing the S3 step, the library is ready to be built. Its construction is shown part by part below and contains more than one function for use. In order to facilitate understanding, the code is separated into four parts, and a condensed sketch of the resulting function is provided after the walk-through:

Function "buscar_relatorio": This function's aim is to structure a report for a specific period as a table, Figure 20.

Part 1: construction of the asset report.

• Line 1: Define the function “buscar_relatorio”, which has the parameters name_bucket (ex: “ifdatatcc”), account (ex: “assets”, “liabilities”, or “DRE”), year (ex: 2020) and quarter (ex: 1, 2, 3, or 4).

• Line 2: A "try" block is used for error handling, as will be shown at the end of the code.

• Line 4: A conditional is created to check if one of the reports was selected.

• Line 7: This shows a link where the CSV file is stored in S3.

Figure 17. Bucket structure. Source: Own authorship.

Figure 18. Structured files in AWS. Source: Own authorship.

Figure 19. Making the archives public. Source: Own authorship.

Figure 20. Handling the assets of the “fetch_report” function. Source: Own authorship.

• Line 13: This creates a conditional for when the “Asset” report is selected.

• Lines 16 to 53: These lines contain data processing, since it is necessary to process the information in order to use it.

• Line 56: This line returns the processed data to the function user.

Part 2: construction of the liability report, Figure 21.

The same steps are performed as in the previous method, but the data are treated according to their peculiarities. The final return is given in line 104.

Part 3: construction of the DRE (Statement of Income for the Year) report, Figure 22.

This has the same logic, but is also treated according to its differences. The final return is on line 161.

Part 4: final procedure of the code for handling exceptions, Figure 23.

• Line 163: If the name of the selected report is not found, a message is returned to the user so that the value passed as a parameter can be corrected.

Figure 21. Handling the liabilities of the “fetch_report” function. Source: Own authorship.

Figure 22. Handling the DRE of the “fetch_report” function. Source: Own authorship.

• Lines 165 to 170: Error handling related to searching for non-existent periods and when the required library for operation is not imported.
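A condensed sketch of "buscar_relatorio" follows. The public S3 URL layout, the CSV separator and encoding, and the reduction of the report-specific cleaning steps (lines 16 to 161 of the original code) to a placeholder are all assumptions.

import pandas as pd

def buscar_relatorio(nome_bucket, conta, ano, trimestre):
    try:
        if conta not in ("assets", "liabilities", "DRE"):
            return "Report not found: use 'assets', 'liabilities' or 'DRE'."
        arquivo = f"{ano}0{trimestre}"                                        # files are named "year-quarter", e.g. "202001"
        url = f"https://{nome_bucket}.s3.amazonaws.com/{conta}/{arquivo}.csv"  # assumed public-bucket URL layout
        dados = pd.read_csv(url, sep=";", encoding="latin-1")                  # separator and encoding assumed
        # ... report-specific processing (lines 16 to 161 of the original code) would go here ...
        return dados
    except Exception:
        return "Period not available or required library not imported."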

Function "buscar_relatorios": Its aim is to consolidate, in a single table, more than one period returned by the function "buscar_relatorio", Figure 24.

To build this code, the previously built function was used to help consolidate the dates; a condensed sketch follows the line-by-line description below.

• Line 1: The function "buscar_relatorios" is defined; it has the parameters "name_bucket" (ex: "ifdatatcc"), account (ex: "assets", "liabilities", or "DRE"), "initial_year" (ex: 2019), "initial_quarter" (ex: 1, 2, 3, or 4), "final_year" (ex: 2020), and "final_quarter" (ex: 1, 2, 3, or 4).

• Line 2: A calculation is created to find the number of reports to be extracted.

• Lines 3 to 6: A loop is needed to find the position of “start_date” in the date list.

• Lines 8 to 14: A loop runs that adds the number of reports calculated in line 2 from the start date. At the end of the procedure, the data are merged into a single table.

• Line 16: Returns the final information to the user.
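A condensed sketch of "buscar_relatorios", which stacks several quarters by calling "buscar_relatorio" repeatedly; the use of pandas period arithmetic here replaces the precomputed date list of the original code.

import pandas as pd

def buscar_relatorios(nome_bucket, conta, initial_year, initial_quarter, final_year, final_quarter):
    # Build the sequence of quarters between the initial and final dates.
    periodos = pd.period_range(f"{initial_year}Q{initial_quarter}",
                               f"{final_year}Q{final_quarter}", freq="Q")
    tabelas = [buscar_relatorio(nome_bucket, conta, p.year, p.quarter) for p in periodos]
    return pd.concat(tabelas, ignore_index=True)           # consolidated table (line 16)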

Function "periodos_disponible": Its objective is to show the user the available report periods. It has no parameters, Figure 25.

Figure 23. Error handling of the “fetch_report” function. Source: Own authorship.

Figure 24. Handling the “fetch_reports” function. Source: Own authorship.

Figure 25. Handling the “available_periods” function. Source: Own authorship.

4. Results and Findings

After the construction of the three methods, all of the functions created were consolidated into a single file in order to offer a broad set of utilities to the users of the resulting library. The result of each function is shown below, Figure 26.

• “data_actions_yahoo”.

• “plot_graphic_closures”, Figure 27.

• “returns_simple”, Figure 28.

• “returns_log”, Figure 29.

• “fetch_balancets”, Figure 30.

• “fetch_report”, Figure 31.

• “fetch_reports”, Figure 32.

• “periods_available”, Figure 33.

Figure 26. Return of the function “dados_acoes_yahoo”. Source: Own authorship.

Figure 27. Return of the function “plot_graphic_closures”. Source: Own authorship.

Figure 28. Return from the “simple_returns” function. Source: Own authorship.

Figure 29. Return of the function “retornos_log”. Source: Own authorship.

Figure 30. Return of the function “fetch_balancets”. Source: Own authorship.

Figure 31. Return of the function “buscar_relatorio”. Source: Own authorship.

Figure 32. Return of the function “buscar_relatorios”. Source: Own authorship.

Figure 33. Return of the function “periods_available”. Source: Own authorship.

5. Conclusion

In summary, it is possible to conclude that data extraction tools are currently of extreme importance, in view of the large informational volume processed and the different sources of information available to users. It is interesting to point out that the profession of data analyst has become increasingly essential for companies (Lima, 2018). Information-capture techniques are applicable in several fields: in health, where evidence can be collected to support research (Bradley, Curry, & Devers, 2007); in business and competitive intelligence, where a company can probe the Web to acquire and analyze information about the activity of its competitors (Baumgartner et al., 2005; Chen et al., 2002); and in the crawling of social Web platforms to map consumer desires and to understand, model, and predict human behavior (Gkotsis et al., 2013; Catanese et al., 2011).

This work explored the Web Scraping technique, a central method in the context of the digital era. The extraction can be made from simple texts as well as, with more advanced techniques, from audio and even video material with the aid of artificial intelligence. Thus, the main contribution of the work was to structure three distinct methods for building a data collection process: the first searches for stock trading prices on the stock exchange and analyzes the data graphically; the second retrieves files in ".zip" format extracted directly from the HyperText Markup Language (HTML) code of the website's server page; and the third creates files in the "Comma-Separated Values" (CSV) format in order to manipulate and organize data for further analysis. The codes are applied to Brazilian data sources; however, the method and coding can be adapted to data sources in other countries. The code used in this work is available to the public on the GitHub platform (Github, 2020). The results show that programs for data extraction can be a great ally for information users: once created, the time such tools save in the structuring of value-generating reports becomes obvious.

The main limitation of this study is the need to check whether downloading data from a site with scripts is allowed and does not violate local data protection laws. Finally, regarding future work for the evolution of the solutions developed in this project, we highlight the study of other techniques, such as using AWS EMR to create a virtual machine capable of processing large-scale files, as well as the demonstration of machine learning and deep learning techniques.

Conflicts of Interest

The authors declare no conflicts of interest regarding the publication of this paper.

References

[1] Anaconda (2020). Individual Edition.
https://www.anaconda.com/products/individual
[2] Baumgartner, R., Frohlich, O., Gottlob, G., Harz, P., Herzog, M., & Lehmann, P. (2005). Web Data Extraction for Business Intelligence: The Lixto Approach. Gesellschaft für Informatik eV.
[3] Borges, L. E. (2014). Python para desenvolvedores: Aborda Python 3.3 (p. 14). Novatec Editora.
https://books.google.com.br/books?hl=pt-BR&lr=lang_pt&id=eZmtBAAAQBAJ&oi=fnd&pg=PA14&dq=python&ots=VDSosqIkmu&sig=TZ0MbKn058lnRJ9zgZrNLmoOFh4#v=onepage&q=python&f=false
[4] Bradley, E. H., Curry, L. A., & Devers, K. J. (2007). Qualitative Data Analysis for Health Services Research: Developing Taxonomy, Themes, and Theory. Health Services Research, 42, 1758-1772.
https://doi.org/10.1111/j.1475-6773.2006.00684.x
[5] Buriol, T. M., Marco, B., & Argenta, M. A. (2009). Acelerando o desenvolvimento eo processamento de análises numéricas computacionais utilizando python e cuda.
https://www.researchgate.net/profile/Marco_Argenta/publication/228683446_ACELERANDO_O_DESENVOLVIMENTO_EO_PROCESSAMENTO_DE_ANALISES_NUMERICAS_COMPUTACIONAIS_UTILIZANDO_PYTHON_E_CUDA/links/5630d6c908ae0530378cdf06.pdf
[6] Catanese, S. A., De Meo, P., Ferrara, E., Fiumara, G., & Provetti, A. (2011, May). Crawling Facebook for Social Network Analysis Purposes. In Proceedings of the International Conference on Web Intelligence, Mining and Semantics (pp. 1-8).
https://doi.org/10.1145/1988688.1988749
[7] Chen, H., Chau, M., & Zeng, D. (2002). CI Spider: A Tool for Competitive Intelligence on the Web. Decision Support Systems, 34, 1-17.
https://doi.org/10.1016/S0167-9236(02)00002-7
[8] Chen, M., Mao, S., & Liu, Y. (2014). Big Data: A Survey. Mobile Networks and Applications, 19, 171-209.
https://doi.org/10.1007/s11036-013-0489-0
[9] Ferrara, E., De Meo, P., Fiumara, G., & Baumgartner, R. (2014). Web Data Extraction, Applications and Techniques: A Survey. Knowledge-Based Systems, 70, 301-323.
https://doi.org/10.1016/j.knosys.2014.07.007
[10] Github (2020). Fabiobragato/Finances.
https://github.com/fabiobragato/finances.git
[11] Gkotsis, G., Stepanyan, K., Cristea, A. I., & Joy, M. (2013, July). Self-Supervised Automated Wrapper Generation for Weblog Data Extraction. In British National Conference on Databases (pp. 292-302). Springer.
https://doi.org/10.1007/978-3-642-39467-6_26
[12] Laender, A. H., Ribeiro-Neto, B. A., Da Silva, A. S., & Teixeira, J. S. (2002). A Brief Survey of Web Data Extraction Tools. ACM SIGMOD Record, 31, 84-93.
https://doi.org/10.1145/565117.565137
[13] Laudon, K. C., & Laudon, J. P. (2014). Management Information Systems: Managing the Digital Firm (13th ed., p. 37). Prentice Hall.
http://dinus.ac.id/repository/docs/ajar/Kenneth_C.Laudon,Jane_P_.Laudon_-_Management_Information_Sysrem_13th_Edition_.pdf
[14] Lima, F. G. (2018). Análise de Riscos. Ed. Atlas.
[15] Miyagusku, R. (2008). Curso Prático de SQL. Guia de Referência Completo Para Usar a linguagem SQL nos Bancos de Dados: MS SQL Server, Oracle, PostgreSQL, MySQL.
[16] Naeem, M., Jamal, T., Diaz-Martinez, J., Butt, S. A., Montesano, N., Tariq, M. I., De-La-Hoz-Valdiris, E. et al. (2022). Trends and Future Perspective Challenges in Big Data. In Advances in Intelligent Data Analysis and Applications (pp. 309-325). Springer.
https://doi.org/10.1007/978-981-16-5036-9_30

Copyright © 2024 by authors and Scientific Research Publishing Inc.


This work and the related PDF file are licensed under a Creative Commons Attribution 4.0 International License.