Dawn of the Dev: Statistical Graphing with R

Monday, 28 May 2012

Statistical Graphing with R

I have been collecting some metrics from the 8 agile teams that I work with and planned to create some nice visual reports to help identify software development practices that could be improved.

The data was initially in an Excel spreadsheet, so I used its built-in graphing capabilities, which had been good enough for previous projects. I quickly found problems that made me want to look for a better solution:

Difficult to create multiple graphs with aligned x-axes.
Every time I wanted to use the same graph with different data, I’d have to either replace the data or tweak every graph.

So I tried Gnumeric, which has some nice graphing features, but its lack of pivot tables put it out of the running. LibreOffice Calc was much the same as Excel. I tried some Linux graphical plotting applications, the nicest of which was QtiPlot, which solved the common x-axis problem, but its vector output was poor.

Eventually I looked at statistical computing environments and settled on the R Project, a programming language for statistical analysis and graphing. It solved all the problems I was having with GUI-based tools and in the process introduced me to a new way of exploring my data.
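As a rough illustration of the aligned x-axes problem, here is a minimal sketch of one way to stack two histograms on a shared axis in base R. The team names and numbers are made up for the example; the idea is simply to give both plots the same bin edges and the same xlim.

```r
# Hypothetical data: story estimates (man-days) for two made-up teams.
team_a <- c(1, 2, 2, 3, 5, 8, 10, 20)
team_b <- c(2, 4, 4, 6, 10, 12)

# One shared set of bin edges, wide enough to cover both teams.
bins <- seq(0, ceiling(max(c(team_a, team_b))), by = 1)
x_limits <- range(bins)

pdf(NULL)                        # discard drawing; use pdf("aligned.pdf") to keep it
old_par <- par(mfrow = c(2, 1))  # two plots stacked in a single column

h_a <- hist(team_a, breaks = bins, xlim = x_limits,
            main = "Team A", xlab = "Estimate (man-days)")
h_b <- hist(team_b, breaks = bins, xlim = x_limits,
            main = "Team B", xlab = "Estimate (man-days)")

par(old_par)                     # restore the previous layout
dev.off()
```

Because the same bins and x_limits feed every plot, swapping in different data does not require touching the axes of each graph.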

R has a shell, so you can load some data into a variable and then start playing with it. You can easily filter data, apply matrix transformations and feed data through mapping functions.
Once you have the data in the shape you’re interested in, you can run it through one of the built-in functions, which give you lots of standard graph types, or delve into the Comprehensive R Archive Network (CRAN), a massive library of user-contributed packages for graphing and analysis.
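To make the filter, transform and map steps concrete, here is a small sketch of the kind of thing you might type at the R prompt. The data frame and its column names (team, estimate_days, actual_days) are invented for illustration and are not the real metrics.

```r
# Hypothetical stand-in for a metrics table; all values are made up.
metrics <- data.frame(
  team = c("A", "A", "B", "B", "C"),
  estimate_days = c(1, 4, 2, 10, 20),
  actual_days = c(2, 4, 3, 14, 18),
  stringsAsFactors = FALSE
)

# Filter: keep only the rows whose estimate is ten days or less.
small <- metrics[metrics$estimate_days <= 10, ]

# Matrix transformation: pull out the numeric columns and transpose them.
m <- as.matrix(small[, c("estimate_days", "actual_days")])
m_t <- t(m)

# Mapping function: apply a function to each estimate/actual pair.
overrun <- mapply(function(est, act) act - est,
                  small$estimate_days, small$actual_days)

# Summarise: mean overrun per team, ready to hand to a plotting function.
mean_by_team <- tapply(overrun, small$team, mean)
```

From here a built-in such as barplot(mean_by_team) would turn the summary into a graph.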

Once you have settled on the transformations and graphs you want to write out, you can write a script that outputs to various file formats, including PDF, SVG and PostScript. At the end of each iteration, I run my script that takes the latest data as a CSV file and outputs all the graphs I need for my report. When I think of new graphs I’d like to include, I add them to my script. It’s easy, reproducible, massively flexible and oh yes… it’s open source too.
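A report script along those lines could look something like the sketch below. It is not my actual script: the helper write_histogram, the file names and the sample values are all assumptions made for the example, and the CSV is written to a temporary file only so the snippet runs on its own.

```r
# Made-up sample written to a temporary CSV so the sketch is self-contained.
csv_path <- tempfile(fileext = ".csv")
write.csv(
  data.frame(Original.Estimate = c(28800, 57600, 57600, 115200, 288000)),
  csv_path, row.names = FALSE
)

# Hypothetical helper: open a file device, draw one histogram, close the device.
write_histogram <- function(values, path, device = pdf) {
  device(path)
  on.exit(dev.off())
  hist(values, main = "Story estimates", xlab = "Estimate (man-days)")
  invisible(path)
}

stories <- read.csv(csv_path)
days <- stories$Original.Estimate / 28800  # assumes seconds and 8-hour days

out_dir <- tempdir()
pdf_path <- write_histogram(days, file.path(out_dir, "estimates.pdf"), pdf)
ps_path <- write_histogram(days, file.path(out_dir, "estimates.ps"), postscript)

# svg() relies on cairo support, which not every R build includes.
if (capabilities("cairo")) {
  svg_path <- write_histogram(days, file.path(out_dir, "estimates.svg"), svg)
}
```

Adding a new graph to the report then means adding one more function call at the bottom of the script.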

Here’s a quick distribution graph (took maybe five minutes)…

This distribution of story estimates (in man-days) shows that:

Even numbers are more popular than odd numbers
9, 11, 17 and 19 are never chosen, presumably due to rounding up or down to 10 and 20.
There is a general preference for nice small stories.
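Observations like these can also be checked numerically rather than by eye. The sketch below uses an invented vector of whole-day estimates (not the real data) and counts how often each value from 1 to 20 occurs, how even and odd values compare, and which values never appear.

```r
# Made-up whole-day estimates; the real data would come from the CSV file.
estimates <- c(1, 2, 2, 4, 4, 6, 8, 10, 10, 12, 16, 18, 20, 3, 5)

# Count every value from 1 to 20, including those that never occur.
counts <- table(factor(estimates, levels = 1:20))

# Even versus odd estimates.
parity <- ifelse(estimates %% 2 == 0, "even", "odd")
parity_counts <- table(parity)

# Values in 1..20 that nobody chose.
never_chosen <- as.integer(names(counts)[counts == 0])
```

Printing counts, parity_counts and never_chosen at the prompt gives the numbers behind the picture.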

All this was created with the following R script:


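# Load the exported metrics (comma-separated, with a header row).
data <- read.table("data.csv", header=TRUE, sep=",")

# Original.Estimate appears to be in seconds: 28800 s is one 8-hour
# man-day, so 576000 s is the 20-day cut-off.
seconds_per_day <- 8 * 60 * 60
small_quotes <- data$Original.Estimate[
  data$Original.Estimate <= 20 * seconds_per_day &
  data$Original.Estimate > 0] / seconds_per_day

# rainbow()'s second argument is saturation, so s=0.5 gives pastel bars.
# paste() already puts a space between its arguments.
hist(
  small_quotes, col=rainbow(40, s=0.5),
  main=paste("Histogram of", length(small_quotes),
    "Quote Sizes <= 20 units"),
  breaks=20, xlab="Estimate", ylab="Frequency"
)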

Have fun setting your data free!
