<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE ArticleSet PUBLIC "-//NLM//DTD PubMed 2.7//EN" "https://dtd.nlm.nih.gov/ncbi/pubmed/in/PubMed.dtd">
<ArticleSet>
<Article>
<Journal>
				<PublisherName>University of Tehran Press</PublisherName>
				<JournalTitle>Journal of Information Technology Management</JournalTitle>
				<Issn>2980-7972</Issn>
				<Volume>10</Volume>
				<Issue>4</Issue>
				<PubDate PubStatus="epublish">
					<Year>2018</Year>
					<Month>12</Month>
					<Day>01</Day>
				</PubDate>
			</Journal>
<ArticleTitle>The Effect of Transitive Closure on the Calibration of Logistic Regression for Entity Resolution</ArticleTitle>
<VernacularTitle></VernacularTitle>
			<FirstPage>1</FirstPage>
			<LastPage>11</LastPage>
			<ELocationID EIdType="pii">72757</ELocationID>
			
<ELocationID EIdType="doi">10.22059/jitm.2019.270013.2324</ELocationID>
			
			<Language>EN</Language>
<AuthorList>
<Author>
					<FirstName>Yumeng</FirstName>
					<LastName>Ye</LastName>
<Affiliation>MSC, Department of Information Quality Program, University of Arkansas at Little Rock, Arkansas, USA.</Affiliation>

</Author>
<Author>
					<FirstName>John</FirstName>
					<LastName>Talburt</LastName>
<Affiliation>Prof., Department of Information Science, University of Arkansas at Little Rock, Arkansas, USA.</Affiliation>

</Author>
</AuthorList>
				<PublicationType>Journal Article</PublicationType>
			<History>
				<PubDate PubStatus="received">
					<Year>2018</Year>
					<Month>11</Month>
					<Day>21</Day>
				</PubDate>
			</History>
		<Abstract>This paper describes a series of experiments using logistic regression as a machine learning method for entity resolution. From these experiments the authors concluded that when a supervised ML algorithm is trained to classify a pair of entity references as linked or not linked, the evaluation of the model’s performance should take into account the transitive closure of its pairwise linking decisions, not just the pairwise classifications alone. Part of the problem is that precision and recall as calculated for data mining classification algorithms such as logistic regression differ from the same measures applied to entity resolution (ER) results. As a classifier, logistic regression precision and recall measure the algorithm’s pairwise decision performance. When applied to ER, precision and recall measure how accurately the set of input references was partitioned into subsets (clusters) referencing the same entity. When applied to datasets containing more than two references, ER is a two-step process. Step One is to classify pairs of records as linked or not linked. Step Two applies transitive closure to these linked pairs to find the maximally connected subsets (clusters) of equivalent references. The precision and recall of the final ER result will generally be different from the precision and recall measures of the pairwise classifier used to power the ER process. The experiments described in the paper were performed using a well-tested set of synthetic customer data for which the correct linking is known. The best F-measure of precision and recall for the final ER result was obtained by substantially increasing the threshold of the logistic regression pairwise classifier.</Abstract>
		<ObjectList>
			<Object Type="keyword">
			<Param Name="value">Entity resolution</Param>
			</Object>
			<Object Type="keyword">
			<Param Name="value">Record linking</Param>
			</Object>
			<Object Type="keyword">
			<Param Name="value">Machine learning</Param>
			</Object>
			<Object Type="keyword">
			<Param Name="value">Logistic regression</Param>
			</Object>
			<Object Type="keyword">
			<Param Name="value">Transitive closure</Param>
			</Object>
		</ObjectList>
<ArchiveCopySource DocType="pdf">https://jitm.ut.ac.ir/article_72757_0af205bf0cf29741afee7e3f17b8062e.pdf</ArchiveCopySource>
</Article>

<Article>
<Journal>
				<PublisherName>University of Tehran Press</PublisherName>
				<JournalTitle>Journal of Information Technology Management</JournalTitle>
				<Issn>2980-7972</Issn>
				<Volume>10</Volume>
				<Issue>4</Issue>
				<PubDate PubStatus="epublish">
					<Year>2018</Year>
					<Month>12</Month>
					<Day>01</Day>
				</PubDate>
			</Journal>
<ArticleTitle>Estimating the Parameters for Linking Unstandardized References with the Matrix Comparator</ArticleTitle>
<VernacularTitle></VernacularTitle>
			<FirstPage>12</FirstPage>
			<LastPage>26</LastPage>
			<ELocationID EIdType="pii">72758</ELocationID>
			
<ELocationID EIdType="doi">10.22059/jitm.2019.274871.2332</ELocationID>
			
			<Language>EN</Language>
<AuthorList>
<Author>
					<FirstName>Awaad</FirstName>
					<LastName>Al-Sarkhi</LastName>
<Affiliation>University of Arkansas at Little Rock, USA.</Affiliation>

</Author>
<Author>
					<FirstName>John R.</FirstName>
					<LastName>Talburt</LastName>
<Affiliation>Associate Professor, University of Arkansas at Little Rock, USA.</Affiliation>

</Author>
</AuthorList>
				<PublicationType>Journal Article</PublicationType>
			<History>
				<PubDate PubStatus="received">
					<Year>2019</Year>
					<Month>01</Month>
					<Day>27</Day>
				</PubDate>
			</History>
		<Abstract>This paper discusses recent research on methods for estimating configuration parameters for the Matrix Comparator used for linking unstandardized or heterogeneously standardized references. The matrix comparator computes the aggregate similarity between the tokens (words) in a pair of references. The two parameters most critical to obtaining the best linking results with the matrix comparator are the value of the similarity threshold and the list of stop words to exclude from the comparison. Earlier research has shown that the standard deviation of the token frequency distribution is strongly predictive of how useful stop words will be in improving linking performance. The research results presented here demonstrate a method for using statistics from the token frequency distribution to estimate the threshold value and stop word selection likely to give the best linking results. The model was built using linear regression and validated with independent datasets.</Abstract>
		<ObjectList>
			<Object Type="keyword">
			<Param Name="value">Entity resolution</Param>
			</Object>
			<Object Type="keyword">
			<Param Name="value">Record linking</Param>
			</Object>
			<Object Type="keyword">
			<Param Name="value">Matrix comparator</Param>
			</Object>
			<Object Type="keyword">
			<Param Name="value">Stop words</Param>
			</Object>
			<Object Type="keyword">
			<Param Name="value">Token frequency</Param>
			</Object>
			<Object Type="keyword">
			<Param Name="value">F-measure</Param>
			</Object>
		</ObjectList>
<ArchiveCopySource DocType="pdf">https://jitm.ut.ac.ir/article_72758_72b897868e02c41658fb6caf9ff2f3a8.pdf</ArchiveCopySource>
</Article>

<Article>
<Journal>
				<PublisherName>University of Tehran Press</PublisherName>
				<JournalTitle>Journal of Information Technology Management</JournalTitle>
				<Issn>2980-7972</Issn>
				<Volume>10</Volume>
				<Issue>4</Issue>
				<PubDate PubStatus="epublish">
					<Year>2018</Year>
					<Month>12</Month>
					<Day>01</Day>
				</PubDate>
			</Journal>
<ArticleTitle>Framework for Prioritizing Solutions in Overcoming Data Quality Problems Using Analytic Hierarchy Process (AHP)</ArticleTitle>
<VernacularTitle></VernacularTitle>
			<FirstPage>27</FirstPage>
			<LastPage>40</LastPage>
			<ELocationID EIdType="pii">72759</ELocationID>
			
<ELocationID EIdType="doi">10.22059/jitm.2019.274888.2333</ELocationID>
			
			<Language>EN</Language>
<AuthorList>
<Author>
					<FirstName>Imelda Doharta</FirstName>
					<LastName>Aritonang</LastName>
<Affiliation>Department of Information System, Faculty of Computer Science, Universitas Indonesia, Depok, West Java.</Affiliation>

</Author>
<Author>
					<FirstName>Achmad Nizar</FirstName>
					<LastName>Hidayanto</LastName>
<Affiliation>Prof., Department of Information System, Faculty of Computer Science, Universitas Indonesia, Depok, West Java.</Affiliation>

</Author>
<Author>
					<FirstName>Nur Fitriah</FirstName>
					<LastName>Ayuning Budi</LastName>
<Affiliation>MSc., Department of Information System, Faculty of Computer Science, Universitas Indonesia, Depok, West Java.</Affiliation>

</Author>
<Author>
					<FirstName>Rahmat M.</FirstName>
					<LastName>Samik Ibrahim</LastName>
<Affiliation>MSc., Department of Information System, Faculty of Computer Science, Universitas Indonesia, Depok, West Java.</Affiliation>

</Author>
<Author>
					<FirstName>Solikin</FirstName>
					<LastName>Solikin</LastName>
<Affiliation>MSc., STMIK Bina Insani, Bekasi, Jawa Barat.</Affiliation>

</Author>
</AuthorList>
				<PublicationType>Journal Article</PublicationType>
			<History>
				<PubDate PubStatus="received">
					<Year>2019</Year>
					<Month>01</Month>
					<Day>28</Day>
				</PubDate>
			</History>
		<Abstract>The Central Statistics Agency (BPS) is a government institution that has the authority to carry out statistical activities in the form of censuses and surveys, to produce statistical data needed by the government, the private sector, and the general public as a reference in planning, monitoring, and evaluating development results. Providing high-quality statistical data is therefore critical, because it affects the effectiveness of decision making. This paper aims to develop a framework for prioritizing solutions to data quality problems using the Analytic Hierarchy Process (AHP). The framework is built by conducting interviews and a Focus Group Discussion (FGD) with experts to identify the interrelationships between problems and solutions. The resulting model is then tested in a case study at the Central Jakarta Central Bureau of Statistics (BPS). The results of the study indicate that the proposed model can be used to formulate solutions to data problems in BPS.</Abstract>
		<ObjectList>
			<Object Type="keyword">
			<Param Name="value">Data quality</Param>
			</Object>
			<Object Type="keyword">
			<Param Name="value">Analytic Hierarchy Process</Param>
			</Object>
			<Object Type="keyword">
			<Param Name="value">AHP</Param>
			</Object>
			<Object Type="keyword">
			<Param Name="value">Central Statistics Agency of the Republic of Indonesia</Param>
			</Object>
		</ObjectList>
<ArchiveCopySource DocType="pdf">https://jitm.ut.ac.ir/article_72759_38e5f7689bf9a665b2dc54f471bc159f.pdf</ArchiveCopySource>
</Article>

<Article>
<Journal>
				<PublisherName>University of Tehran Press</PublisherName>
				<JournalTitle>Journal of Information Technology Management</JournalTitle>
				<Issn>2980-7972</Issn>
				<Volume>10</Volume>
				<Issue>4</Issue>
				<PubDate PubStatus="epublish">
					<Year>2018</Year>
					<Month>12</Month>
					<Day>01</Day>
				</PubDate>
			</Journal>
<ArticleTitle>Investigating the Role of Code Smells in Preventive Maintenance</ArticleTitle>
<VernacularTitle></VernacularTitle>
			<FirstPage>41</FirstPage>
			<LastPage>63</LastPage>
			<ELocationID EIdType="pii">72760</ELocationID>
			
<ELocationID EIdType="doi">10.22059/jitm.2019.274968.2335</ELocationID>
			
			<Language>EN</Language>
<AuthorList>
<Author>
					<FirstName>Junaid</FirstName>
					<LastName>Ali Reshi</LastName>
<Affiliation>M.Tech Student, Department of Computer Science and Technology, Central University of Punjab, Bhatinda, Punjab, India.</Affiliation>

</Author>
<Author>
					<FirstName>Satwinder</FirstName>
					<LastName>Singh</LastName>
<Affiliation>Assistant Prof., Department of Computer Science and Technology, Central University of Punjab, Bhatinda, Punjab, India.</Affiliation>

</Author>
</AuthorList>
				<PublicationType>Journal Article</PublicationType>
			<History>
				<PubDate PubStatus="received">
					<Year>2019</Year>
					<Month>01</Month>
					<Day>31</Day>
				</PubDate>
			</History>
		<Abstract>The quest to improve software quality has given rise to various studies that focus on enhancing the quality of software through various processes. Code smells, which are indicators of software quality, have not been studied extensively to determine their role in the prediction of defects in software. This study aims to investigate the role of code smells in the prediction of non-faulty classes. We examine four versions of the Eclipse software (3.2, 3.3, 3.6, and 3.7) for metrics and smells. Further, different code smells, derived subjectively through iPlasma, are taken in conjunction, and three efficient but subjective models are developed to detect code smells, one for each of the Random Forest, J48, and SVM machine learning algorithms. These models are then used to detect the absence of defects in the four Eclipse versions. The effect of balanced and unbalanced datasets is also examined for these four versions. The results suggest that code smells can be a valuable feature in discriminating the absence of defects in software.</Abstract>
		<ObjectList>
			<Object Type="keyword">
			<Param Name="value">Preventive maintenance</Param>
			</Object>
			<Object Type="keyword">
			<Param Name="value">Code smells</Param>
			</Object>
			<Object Type="keyword">
			<Param Name="value">Machine learning</Param>
			</Object>
			<Object Type="keyword">
			<Param Name="value">Random forest</Param>
			</Object>
		</ObjectList>
<ArchiveCopySource DocType="pdf">https://jitm.ut.ac.ir/article_72760_08e4a5599636c1e62e341c1d67adff80.pdf</ArchiveCopySource>
</Article>

<Article>
<Journal>
				<PublisherName>University of Tehran Press</PublisherName>
				<JournalTitle>Journal of Information Technology Management</JournalTitle>
				<Issn>2980-7972</Issn>
				<Volume>10</Volume>
				<Issue>4</Issue>
				<PubDate PubStatus="epublish">
					<Year>2018</Year>
					<Month>12</Month>
					<Day>01</Day>
				</PubDate>
			</Journal>
<ArticleTitle>Big Data Quality: From Content to Context</ArticleTitle>
<VernacularTitle></VernacularTitle>
			<FirstPage>64</FirstPage>
			<LastPage>71</LastPage>
			<ELocationID EIdType="pii">72762</ELocationID>
			
<ELocationID EIdType="doi">10.22059/jitm.2019.72762</ELocationID>
			
			<Language>EN</Language>
<AuthorList>
<Author>
					<FirstName>Ahmad</FirstName>
					<LastName>Khalilijafarabad</LastName>
<Affiliation>PhD, Department of Information Technology Management, Faculty of Management, University of Tehran, Tehran, Iran.</Affiliation>

</Author>
</AuthorList>
				<PublicationType>Journal Article</PublicationType>
			<History>
				<PubDate PubStatus="received">
					<Year>2019</Year>
					<Month>09</Month>
					<Day>21</Day>
				</PubDate>
			</History>
		<Abstract>Over the last 20 years, and particularly with the advent of Big Data and analytics, Data and Information Quality (DIQ) has remained a fast-growing research area. There are many views and streams in DIQ research, generally aiming at improving the effectiveness of decision making in organizations. Although much research has aimed at clarifying the role of Big Data quality for organizations, there is no comprehensive literature review that shows the main differences between traditional data quality research and Big Data quality research. This paper analyzes the papers published on Big Data quality and finds that there is almost no new mainstream in Big Data quality research. It shows that the main concepts of data quality do not change in the Big Data context and that only some new issues have been added to this area.</Abstract>
		<ObjectList>
			<Object Type="keyword">
			<Param Name="value">Big data</Param>
			</Object>
			<Object Type="keyword">
			<Param Name="value">Big data quality</Param>
			</Object>
			<Object Type="keyword">
			<Param Name="value">Data quality</Param>
			</Object>
			<Object Type="keyword">
			<Param Name="value">Text mining</Param>
			</Object>
		</ObjectList>
<ArchiveCopySource DocType="pdf">https://jitm.ut.ac.ir/article_72762_ade9cf8ca3448807a0edb217c756179d.pdf</ArchiveCopySource>
</Article>

<Article>
<Journal>
				<PublisherName>University of Tehran Press</PublisherName>
				<JournalTitle>Journal of Information Technology Management</JournalTitle>
				<Issn>2980-7972</Issn>
				<Volume>10</Volume>
				<Issue>4</Issue>
				<PubDate PubStatus="epublish">
					<Year>2018</Year>
					<Month>12</Month>
					<Day>01</Day>
				</PubDate>
			</Journal>
<ArticleTitle>Perspectives of Big Data Quality in Smart Service Ecosystems (Quality of Design and Quality of Conformance)</ArticleTitle>
<VernacularTitle></VernacularTitle>
			<FirstPage>72</FirstPage>
			<LastPage>83</LastPage>
			<ELocationID EIdType="pii">72763</ELocationID>
			
<ELocationID EIdType="doi">10.22059/jitm.2019.72763</ELocationID>
			
			<Language>EN</Language>
<AuthorList>
<Author>
					<FirstName>Markus</FirstName>
					<LastName>Helfert</LastName>
<Affiliation>Ph.D., Head of Business Informatics Group, Department of Computing, Dublin City University, Dublin, Ireland.</Affiliation>

</Author>
<Author>
					<FirstName>Mouzhi</FirstName>
					<LastName>Ge</LastName>
<Affiliation>Associate Professor, Department of Computer Systems and Communications, Faculty of Informatics, Masaryk University, Brno, Czech Republic.</Affiliation>

</Author>
</AuthorList>
				<PublicationType>Journal Article</PublicationType>
			<History>
				<PubDate PubStatus="received">
					<Year>2019</Year>
					<Month>09</Month>
					<Day>21</Day>
				</PubDate>
			</History>
		<Abstract>Despite the increasing importance of data and information quality, current research related to Big Data quality is still limited. In particular, it is not known how to apply previous data quality models to Big Data. In this paper we review Big Data quality research from several perspectives and apply a known quality model, with its elements of conformance to specification and design, in the context of Big Data. Furthermore, we extend this model and demonstrate its utility by analyzing the impact of three Big Data characteristics (volume, velocity, and variety) in the context of smart cities. This paper intends to build a foundation for further empirical research to understand Big Data quality and its implications for the design and execution of smart service ecosystems.</Abstract>
		<ObjectList>
			<Object Type="keyword">
			<Param Name="value">Big data quality</Param>
			</Object>
			<Object Type="keyword">
			<Param Name="value">Information quality</Param>
			</Object>
			<Object Type="keyword">
			<Param Name="value">Smart cities</Param>
			</Object>
			<Object Type="keyword">
			<Param Name="value">Service design</Param>
			</Object>
			<Object Type="keyword">
			<Param Name="value">Smart services</Param>
			</Object>
			<Object Type="keyword">
			<Param Name="value">Data quality model</Param>
			</Object>
			<Object Type="keyword">
			<Param Name="value">Smart service ecosystem</Param>
			</Object>
		</ObjectList>
<ArchiveCopySource DocType="pdf">https://jitm.ut.ac.ir/article_72763_804dac558197e9e9dc2997c751d2eff9.pdf</ArchiveCopySource>
</Article>
</ArticleSet>
