From: Stefan Weiss on
Does anybody here know of a tool which can count the number of source
files and lines of code used in a project, and will parse JS files as
well as the more commonly supported languages (C, C++, Java, Python,
Perl, etc)? I've been using SLOCCount [0] since 2005, and it's a nice
toy, but it hasn't been maintained since its release, and it still
ignores JS files, SQL stored procedures, and many others. In some
projects, these file types make up more than half of the code, so I'm
looking for alternatives.

Before anybody objects to the use of LOC as a code metric: yes, I'm
fully aware that this approach is futile. A project with 1000 lines of
PHP will have more than double the LOC when implemented in C, and only
one line, or _maybe_ two lines, if written in Perl. Not to mention
differences in coding style and all the other factors which can't be
measured by counting lines (structure, comments, readability, copypasta,
research, and so on). In the rare case where two projects are similar
enough, the LOC count could theoretically be used to compare them, but
that's about the limit of its usefulness.

I'm not looking for accuracy or correctness here. I just like to see the
numbers grow. I run SLOCCount about once a week, usually at the end of a
long coding session, and it gives me a warm and fuzzy feeling before I
go home. For example, the project I'm currently working on is a small
web application, written from scratch. After 2 weeks, SLOCCount gives
the following report:

[snip details]
Total Physical Source Lines of Code (SLOC) = 4,673
Development Effort Estimate, Person-Years (Person-Months) = 1.01 (12.11)
(Basic COCOMO model, Person-Months = 2.4 * (KSLOC**1.05))
Schedule Estimate, Years (Months) = 0.54 (6.45)
(Basic COCOMO model, Months = 2.5 * (person-months**0.38))
Estimated Average Number of Developers (Effort/Schedule) = 1.88
Total Estimated Cost to Develop = $ 136,369
(average salary = $56,286/year, overhead = 2.40).
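
The figures in this report follow from the two formulas it prints. Here is a minimal sketch in Python that recomputes them from the SLOC total. The function name `basic_cocomo` and its parameters are made up for illustration and are not SLOCCount's interface. The cost line assumes cost = effort in years * salary * overhead, which matches the report to within rounding.

```python
# Basic COCOMO with the default constants quoted in the report.
# All names here are illustrative; this is not SLOCCount's own code.

def basic_cocomo(sloc, salary=56286.0, overhead=2.40):
    ksloc = sloc / 1000.0
    person_months = 2.4 * ksloc ** 1.05            # development effort
    schedule_months = 2.5 * person_months ** 0.38  # calendar time
    developers = person_months / schedule_months   # average team size
    # Assumed cost model: effort in years * yearly salary * overhead factor.
    cost = (person_months / 12.0) * salary * overhead
    return person_months, schedule_months, developers, cost


pm, months, devs, cost = basic_cocomo(4673)
print(f"effort   = {pm:.2f} person-months")
print(f"schedule = {months:.2f} months")
print(f"devs     = {devs:.2f}")
print(f"cost     = ${cost:,.0f}")
```

With 4,673 SLOC this gives roughly 12.11 person-months, 6.45 months, and 1.88 developers, so the dollar amount scales directly with whatever salary and overhead are plugged in.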

I should probably adjust the model's parameters before running the tool,
but since the amounts are pure fantasy anyway, I just run it with the
defaults. There's no way that this little project is worth anywhere near
$100k; it's not even in the same order of magnitude. Still, I like to see the
numbers increasing. It's also interesting to see what percentage of the
whole the different languages make up (I snipped this part).

And that's the problem: the tool arrived at 4673 LOC, but completely
ignored input like JS, SQL, and executable scripts without an extension;
not to mention CSS and HTML (which aren't programming languages, but do
contribute to the overall effort).

I've seen some other code analyzers mentioned [1], but most of them
don't handle JS. The ones that do are too expensive to be used for fun
[2]. If any of you have worked with similar tools, I'd be interested to
hear about your experiences.


[0] http://www.dwheeler.com/sloccount/
[1] http://www.locmetrics.com/alternatives.html
[2] http://www.jamesheiresconsulting.com/Order%20Form.htm


PS: Again, just in case: I only use SLOCCount for fun. I'm not taking
the numbers seriously. I suppose I could try to impress clients with the
$$ figures, but I'd consider that lying.


--
stefan
From: Garrett Smith on
On 5/25/2010 5:30 PM, Stefan Weiss wrote:
> Does anybody here know of a tool which can count the number of source
> files and lines of code used in a project, and will parse JS files as
> well as the more commonly supported languages (C, C++, Java, Python,
> Perl, etc)? I've been using SLOCCount [0] since 2005, and it's a nice
> toy, but it hasn't been maintained since its release, and it still
> ignores JS files, SQL stored procedures, and many others. In some
> projects, these file types make up more than half of the code, so I'm
> looking for alternatives.
>
> Before anybody objects to the use of LOC as a code metric: yes, I'm
> fully aware that this approach is futile. A project with 1000 lines of
> PHP will have more than double the LOC when implemented in C, and only
> one line, or _maybe_ two lines, if written in Perl. Not to mention
> differences in coding style and all the other factors which can't be
> measured by counting lines (structure, comments, readability, copypasta,

Don't overdo the carbs.

> research, and so on). In the rare case where two projects are similar
> enough, the LOC count could theoretically be used to compare them, but
> that's about the limit of its usefulness.
>
> I'm not looking for accuracy or correctness here. I just like to see the
> numbers grow.

I'm the exact opposite. I love it when the SLOC decreases.

I'm more interested in analyzing dependencies. Sorry, I can't help you.
From: nick on
On May 25, 8:30 pm, Stefan Weiss <krewech...(a)gmail.com> wrote:
> Does anybody here know of a tool which can count the number of source
> files and lines of code used in a project [...]

I'd probably use some combination of `find` and `wc` for that... might
be worth looking into?
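
A rough equivalent of the `find` plus `wc` idea, sketched in Python so it can be extended later. It counts every physical line, including blank lines and comments. The function name `raw_line_counts` and the default extension list are assumptions chosen for illustration.

```python
import os
from collections import Counter


def raw_line_counts(root, extensions=(".js", ".sql", ".css", ".html")):
    """Walk `root` and return (files, lines) Counters keyed by extension.

    Every physical line is counted, much like `wc -l`, except that a
    final line without a trailing newline is counted here as well.
    """
    files, lines = Counter(), Counter()
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            ext = os.path.splitext(name)[1].lower()
            if ext not in extensions:
                continue
            path = os.path.join(dirpath, name)
            # Binary mode avoids decoding errors on odd encodings.
            with open(path, "rb") as fh:
                line_total = sum(1 for _ in fh)
            files[ext] += 1
            lines[ext] += line_total
    return files, lines
```

Calling `raw_line_counts(".")` in a project directory returns the two counters, which can then be printed per extension.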
From: Stefan Weiss on
On 26/05/10 03:37, nick wrote:
> On May 25, 8:30 pm, Stefan Weiss <krewech...(a)gmail.com> wrote:
>> Does anybody here know of a tool which can count the number of source
>> files and lines of code used in a project [...]
>
> I'd probably use some combination of `find` and `wc` for that... might
> be worth looking into?

I think you'd have to actually parse the source files. LOC counts
usually don't include blank lines or comments. Removing comments from a
JS file with grep/sed/awk/etc is not a trivial task. sed and awk are
Turing complete, which means it must be possible, but I'd rather not
try :)
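
To show why plain line matching falls short, here is a small character-level scanner in Python that skips blank lines and comments. It is only a sketch: it understands `//` and `/* */` comments and quoted strings, but it does not handle regex literals or template literals, so a real JS tokenizer would still be needed for exact counts. The name `count_js_sloc` is made up for illustration.

```python
def count_js_sloc(source):
    """Count lines that contain at least one character of code.

    Handles // and /* */ comments and '...' / "..." strings with
    backslash escapes. Regex and template literals are NOT handled.
    """
    sloc = 0
    has_code = False  # current line holds something besides space/comment
    state = "code"    # "code", "line_comment", "block_comment", or a quote
    i, n = 0, len(source)
    while i < n:
        ch = source[i]
        nxt = source[i + 1] if i + 1 < n else ""
        if ch == "\n":
            if has_code:
                sloc += 1
            has_code = False
            if state in ("line_comment", "'", '"'):
                # Line comments end here; unterminated strings resync.
                state = "code"
            i += 1
            continue
        if state == "code":
            if ch == "/" and nxt == "/":
                state = "line_comment"
                i += 2
                continue
            if ch == "/" and nxt == "*":
                state = "block_comment"
                i += 2
                continue
            if ch in "'\"":
                state = ch
            if not ch.isspace():
                has_code = True
        elif state == "block_comment":
            if ch == "*" and nxt == "/":
                state = "code"
                i += 2
                continue
        elif state in ("'", '"'):
            has_code = True
            if ch == "\\":
                if nxt == "\n":
                    # Line continuation inside a string: count this line.
                    sloc += 1
                    has_code = False
                i += 2  # skip the escaped character
                continue
            if ch == state:
                state = "code"
        i += 1
    if has_code:
        sloc += 1
    return sloc


sample = 'var a = 1; // trailing\n/* block\n   comment */\n\nvar s = "// not a comment";\n'
print(count_js_sloc(sample))  # prints 2
```

The `//` inside the string on the last line is what trips up a naive sed or grep pattern, which is the reason for tracking string state at all.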


--
stefan
From: nick on
On May 25, 9:45 pm, Stefan Weiss <krewech...(a)gmail.com> wrote:
> On 26/05/10 03:37, nick wrote:
>
> > On May 25, 8:30 pm, Stefan Weiss <krewech...(a)gmail.com> wrote:
> >> Does anybody here know of a tool which can count the number of source
> >> files and lines of code used in a project [...]
>
> > I'd probably use some combination of `find` and `wc` for that... might
> > be worth looking into?
>
> I think you'd have to actually parse the source files. LOC counts
> usually don't include blank lines or comments.
> Removing comments from a JS file with grep/sed/awk/etc is not a trivial
> task. sed and awk are Turing complete, which means it must be possible,
> but I'd rather not try :)

`cpp` (the C preprocessor) will remove the comments... it leaves behind
some other junk (mostly `#` linemarker lines), but that's easily removed
with sed.