Post by noseyparker » Fri Jul 15, 2016 4:48 am

While investigating HDD images, Autopsy collects an extremely large amount of data. Of course, I would like to save a report containing all the important details in the most popular of the supported formats. That is why I chose the HTML report.
1) But a single page, "Keyword_hits.html", is more than 150(!!!) MB in size. It makes any web browser hang.
I attempted to parse this page with HtmlAgilityPack in the Visual Studio 2015 IDE. The opened document takes over 6 GB of RAM.
So, could you please split big HTML files? For example, by a new Autopsy option "maxrowscount" or by <h1> tag.
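To illustrate what I mean by a row limit, here is a minimal sketch in Python. It is not Autopsy code: the names chunk_rows, render_pages and max_rows, and the numbered file names, are all hypothetical. It only shows that rows can be written out page by page without ever holding the whole report in memory.

```python
from itertools import islice
from typing import Iterable, Iterator, List, Tuple


def chunk_rows(rows: Iterable[str], max_rows: int) -> Iterator[List[str]]:
    """Yield lists of at most max_rows rows, reading the input lazily."""
    if max_rows < 1:
        raise ValueError("max_rows must be at least 1")
    it = iter(rows)
    while True:
        chunk = list(islice(it, max_rows))
        if not chunk:
            return
        yield chunk


def render_pages(
    rows: Iterable[str], max_rows: int, base_name: str = "Keyword_hits"
) -> Iterator[Tuple[str, str]]:
    """Yield (file name, HTML text) pairs, one per page of rows.

    The file naming scheme is an assumption made for this example.
    """
    for number, chunk in enumerate(chunk_rows(rows, max_rows), start=1):
        name = f"{base_name}_{number:03d}.html"
        body = "\n".join(chunk)
        page = (
            '<html><head><meta charset="utf-8"></head><body><table>\n'
            f"{body}\n"
            "</table></body></html>"
        )
        yield name, page
```

With something like max_rows=1000, each page stays small enough for a browser, and an index page could link to the numbered files.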

2) Finally I got to see the report pages. But some of them contain unexpected codes instead of normal characters. For example, if a file path contains Cyrillic symbols, all of them are replaced in the report. The affected pages are "Web Bookmarks", "Recent Documents" and possibly some others.
The same problem shows up in RegRipper's report. I think Autopsy doesn't check the code pages of the data it includes, and this leads to the HTML report displaying data incorrectly. Please fix it!
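I can only guess at the cause, but the symptom looks like bytes being decoded with the wrong code page. A minimal Python sketch of that guess follows. The helper to_html_cell is hypothetical, and cp1251 as the source encoding is purely an assumption for the example; I don't know what encoding the real data uses.

```python
import html


def to_html_cell(raw: bytes, source_encoding: str) -> str:
    """Decode raw bytes with their real encoding, then escape for HTML.

    source_encoding has to be known or detected by the caller; guessing it
    wrong is what produces garbage characters.
    """
    return html.escape(raw.decode(source_encoding))


# A path fragment stored in the Windows Cyrillic code page (assumed).
raw = "Документы".encode("cp1251")

# Decoding with a different code page silently yields the wrong characters.
wrong = raw.decode("latin-1")

# Decoding with the real code page recovers the original text.
right = to_html_cell(raw, "cp1251")

# The page should also declare the encoding it is written in.
page = f'<html><head><meta charset="utf-8"></head><body>{right}</body></html>'
```

If the report is written as UTF-8 and declares that in a meta tag, the browser has no reason to guess a code page either.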

P.S. It is possible to get repeated results. For example, using the regex "(ftp\:\/\/|ssh)" without the 'ignorecase' flag, we get the same results for the subgroups 'ssh', 'ssH', 'sSh', 'sSH', 'Ssh' etc. Could you please add functions to exclude duplicates, exclude exact result(s), and exclude a data source and its related results?
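As a rough sketch of the duplicate exclusion I have in mind (plain Python, not anything from Autopsy; group_hits is a hypothetical name): hits are grouped under a case-folded key, so 'ssh', 'ssH' and 'Ssh' land in one subgroup, and the same text is listed only once per subgroup.

```python
import re
from typing import Dict, Iterable, List, Set, Tuple

# Same pattern as above, compiled case-insensitively for this example.
PATTERN = re.compile(r"(ftp://|ssh)", re.IGNORECASE)


def group_hits(texts: Iterable[str]) -> Dict[str, List[str]]:
    """Group matching texts by the case-folded keyword that matched."""
    groups: Dict[str, List[str]] = {}
    seen: Set[Tuple[str, str]] = set()
    for text in texts:
        for match in PATTERN.finditer(text):
            key = match.group(0).casefold()
            if (key, text) in seen:
                continue  # exact duplicate within this subgroup
            seen.add((key, text))
            groups.setdefault(key, []).append(text)
    return groups
```

For instance, passing "use SSH here" twice together with "use ssh there" gives a single 'ssh' subgroup with two entries.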

Thanks a lot!
