Ekrem Kocaguneli: What are the practical problems of software effort data sets?

Monday, February 6, 2012

What are the practical problems of software effort data sets?

This post is adapted from my replies to a reviewer of a journal submission. The reviewer was concerned about the age of some of the software effort data sets. That is a very valid concern, given that some of the most widely used data sets can be 30 or more years old.

This made me think about other problems with software effort data sets, and I identified 4 main issues. Researchers usually discuss these issues at conferences, but I had never formally raised them in one of my papers. So hopefully, if this paper gets accepted (fingers crossed), we will see the issues below on record regarding software effort data sets.

1) Why collect data: For most companies, assembling scattered information into a software effort data set requires extra effort, so it is a serious ask for practitioners to spend their already limited time on data collection. Unless user groups or practitioners in companies are given explicit returns for that effort, this difficulty is unlikely to disappear.

2) Data Privacy: Another concern for companies is exposing their proprietary information. This issue calls for data anonymization techniques in software effort estimation. A company that is convinced its proprietary information has been securely altered is more likely to make its data publicly available.

3) Proprietary-right Period: This is a delicate issue with many pros and cons. For research labs, sharing data can mean giving up the exclusive right to publish results from that data set. Given the "publish or perish" pressure on academics, the reluctance to share data is understandable. On the other hand, our ultimate goal is to move software engineering one step ahead, one paper at a time... One solution is to define a proprietary-right period. Astronomy, for example, offers good examples of how to handle this problem.

4) Data Set Aging: The age of the data sets is another concern on which there is no consensus. From a practitioner’s point of view, there is limited value in using old data sets, as they may no longer represent current standards. On the other hand, for a researcher like myself, not using old but widely used data sets would create quite a lot of problems during the review process of a paper. The easiest solution seems to be agreeing on a validity period. However, unless the software effort estimation field is supplemented with new data sets, I believe the use of older data sets will remain standard practice for most researchers.
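
As a rough illustration of the validity-period idea in item 4, the sketch below keeps only the projects completed within a chosen number of years. The field names (`end_year`, `effort_hours`), the sample records, and the 10-year cutoff are all hypothetical; none of them come from an actual effort data set or an agreed standard.

```python
from dataclasses import dataclass


@dataclass
class Project:
    # Hypothetical record layout; real effort data sets define their own fields.
    name: str
    end_year: int
    effort_hours: float


def within_validity_period(projects, current_year, max_age_years):
    """Return the projects that finished at most max_age_years before current_year."""
    return [p for p in projects if current_year - p.end_year <= max_age_years]


# Made-up records for illustration only.
projects = [
    Project("A", 1985, 1200.0),
    Project("B", 2005, 800.0),
    Project("C", 2010, 450.0),
]

recent = within_validity_period(projects, current_year=2012, max_age_years=10)
print([p.name for p in recent])
```

With these made-up records, only the two newer projects pass the filter. That is the practical cost of a validity period: if most available data is old, very little of it survives the cutoff.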

