Recently we were asked to do a quick analysis of an existing code base. Since we knew nothing about the code base, we started by gathering some numbers.
What’s inside the root directory of the project?
$ ls -1
documentation
php-gui
php-parser
scripts
…
At first glance the primary programming language is PHP.
How many files are in the repository and its sub-directories?
$ find . -type f | wc -l
17780
$ for file in * ; do echo "$file" ; find "$file" -type f | wc -l ; done
documentation
770
php-gui
11834
php-parser
140
scripts
186
…
Ok. So the majority of the source files seem to be inside php-gui, with some in php-parser.
We’ve got some documentation and scripts.
The same query as a shell script, for better readability:
for file in *
do
    echo "$file"
    find "$file" -type f | wc -l
done
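If the repository has many top-level directories, a sorted one-line-per-directory listing is easier to scan. The following is only a sketch: the function name `count_files_per_dir` is our own invention, and it assumes file names without embedded newlines.

```shell
#!/bin/sh
# Hypothetical helper: print "<count> <dir>" for every directory directly
# below the given root (default: current directory), largest first.
count_files_per_dir() {
    root=${1:-.}
    for dir in "$root"/*/; do
        [ -d "$dir" ] || continue      # the glob matched nothing
        dir=${dir%/}                   # drop the trailing slash
        # tr strips the padding some wc implementations add
        count=$(find "$dir" -type f | wc -l | tr -d ' ')
        printf '%s %s\n' "$count" "${dir##*/}"
    done | sort -rn
}
```

Called as `count_files_per_dir .` in the project root, it should list the directory with the most files first.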
How many different file types does the repository contain?
We are not the first to be interested in this kind of information, see "List all unique extensions for files contained in a directory":
$ find . -type f | sed -E 's/.+[\./]([^/\.]+)/\1/' | sort -u | wc -l
240
Impressive, but not very useful information.
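A count per extension tells us more than the bare number of distinct extensions. A minimal sketch, assuming a sed that understands `-E`; the function name `count_extensions` and its optional second argument (number of rows, default 10) are made up for this example.

```shell
#!/bin/sh
# Hypothetical helper: list the most common file extensions below a
# directory (default: current directory) as "<count> <extension>".
# Files without a dot in their name are skipped.
count_extensions() {
    # sed keeps only the text after the last dot of the file name and
    # prints nothing for names that have no extension
    find "${1:-.}" -type f |
        sed -nE 's/.*\.([^./]+)$/\1/p' |
        sort | uniq -c | sort -rn |
        sed 's/^ *//' |
        head -n "${2:-10}"
}
```

For example, `count_extensions . 20` would print the twenty most frequent extensions in the repository.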
The main language seems to be PHP. How many PHP source files are in php-parser and php-gui?
$ find . -type f -name "*.php" | wc -l
4590
More than 4500 PHP source files in the whole repository, presumably most of them in php-gui.
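The find above starts at `.`, so its count covers every directory in the repository. To see how the PHP files are spread over individual directories, something like the following could be used. It is a sketch only, and the name `count_by_pattern` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: count the files matching a name pattern in each of
# the given directories and print "<count> <dir>" per directory.
count_by_pattern() {
    pattern=$1
    shift
    for dir in "$@"; do
        # tr strips the padding some wc implementations add
        count=$(find "$dir" -type f -name "$pattern" | wc -l | tr -d ' ')
        printf '%s %s\n' "$count" "$dir"
    done
}
```

In our case the call would be `count_by_pattern '*.php' php-parser php-gui`.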
Any PHPUnit tests in php-parser or php-gui?
$ find php-parser -type f -name "*php" -exec grep -H -n 'phpunit' {} \;
Negative. What about the Zend Framework powered php-gui?
$ find php-gui -type f -name "*php" -exec grep -H -n 'phpunit' {} \; | grep -v ZendFramework | wc -l
3
Ok, just three matching lines once Zend Framework itself is filtered out.
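One caveat: the grep calls above are case-sensitive, so they would not match class names such as `PHPUnit_Framework_TestCase`. A case-insensitive variant that lists matching files instead of matching lines might look like this. It is a sketch, and the function name `find_phpunit_files` is our own.

```shell
#!/bin/sh
# Hypothetical helper: list the PHP files below a directory (default:
# current directory) that mention "phpunit" in any letter case.
find_phpunit_files() {
    # -l prints only file names, -i ignores case; "{} +" hands many
    # files to each grep call instead of starting one grep per file
    find "${1:-.}" -type f -name '*.php' -exec grep -li 'phpunit' {} +
}
```

To leave out the framework's own files as before, pipe the result through `grep -v ZendFramework`.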
How large are those source files (excluding the big input files)?
$ ls -lR **/*.php | grep '^-' | grep -v input | sort -k 5 -rn | head
…
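Note that the `**` glob only recurses in zsh, or in bash after `shopt -s globstar`, and on large trees the expanded file list may hit the argument length limit. A find-based variant avoids both problems. Again this is just a sketch: `largest_php_files` is a made-up name, and file sizes are taken from `wc -c` in bytes.

```shell
#!/bin/sh
# Hypothetical helper: print the largest PHP files below a directory as
# "<bytes> <path>", skipping every path that contains "input".
# Optional second argument: number of rows (default 10).
largest_php_files() {
    # awk drops the "total" lines wc prints for multiple files and
    # strips the leading padding
    find "${1:-.}" -type f -name '*.php' ! -path '*input*' -exec wc -c {} + |
        awk '!(NF == 2 && $2 == "total") { sub(/^ +/, ""); print }' |
        sort -rn |
        head -n "${2:-10}"
}
```

`largest_php_files php-gui 20` would then show the twenty biggest PHP files in php-gui.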
I bet you’ve got other or more ideas for gaining first insights into an existing project. Just leave a comment below. Thanks!