Monday, February 6, 2012

What are the practical problems of software effort data sets?

This post is adapted from my replies to a reviewer of a journal submission. The reviewer was concerned about the age of some of the software effort data sets. That is a valid concern, considering that some of the most widely used data sets can be more than 30 years old.

This got me thinking about other problems related to software effort data sets, and I identified four main issues. These issues are often discussed among researchers at conferences, but I had never formally raised them in one of my papers. So hopefully, if this paper gets accepted (fingers crossed), the issues below will finally appear in print.

1) Why collect data: For most companies, putting together scattered information into a software effort data set requires extra effort. Asking practitioners to spend their already limited time on data collection is therefore a serious burden. Unless user groups or practitioners in companies are shown explicit benefits in return for that effort, this difficulty is unlikely to disappear.

2) Data Privacy: Another concern for companies is exposing their proprietary information. This issue calls for data anonymization techniques in software effort estimation. A company that is convinced its proprietary information has been securely altered is more likely to make its data publicly available.

3) Proprietary-right Period: This is quite a delicate issue with many pros and cons. For research labs, sharing data can mean giving up the exclusive right to publish results from that data set. Given the "publish or perish" culture of academia, the reluctance to share data is understandable. On the other hand, our ultimate goal is to move software engineering one step forward, one paper at a time... One solution could be to define a proprietary-right period, during which the group that collected the data keeps exclusive use of it and after which the data is shared. Astronomy offers good examples of this approach.

4) Data Set Aging: The age of data sets is another concern on which there is no consensus. From a practitioner’s point of view, there is limited value in using old data sets, as they may no longer represent current standards. On the other hand, for a researcher like myself, not using old but widely used data sets would create quite a lot of problems during the review process of a paper. The easiest solution seems to be agreeing on a validity period. However, unless the software effort estimation field is supplemented with new data sets, I believe the use of older data sets will remain standard practice for most researchers.
