At the top of the class, I tried two ways to get an instance of the layer:
'This way, exporting to make a copy results in very fast search - 17 seconds to look up 137 addresses
Dim sf_temp As MapWinGIS.Shapefile
sf_temp = Form1.Map1.get_GetObject(Form1.Handle_RoadsLayer) ' this gets layer for searching
sf_temp.SelectAll()
sf = sf_temp.ExportSelection
' this is the alternative to the above creation of temp layer
' runs much slower: geocoding 137 points takes 2 min 13 sec
sf = Form1.Map1.get_GetObject(Form1.Handle_RoadsLayer) ' this gets layer for searching
Searching is dramatically slower the second way. Any thoughts?
I made a unit test to quickly try some variations, and I also get different timings.
I need to debug this more.
Here’s my unit test:
var sf = Helper.OpenShapefile("RD12115.shp", true, this);
const string query = @"[NAME] =""la France Ave"" and (([FROMLEFT]<=4201 and [TOLEFT]>=4499) or ([FROMRIGHT]<=4200 and [TORIGHT]>=4498)) and ([ZIP_LEFT] = ""34286"" or [ZIP_RIGHT] = ""34286"")";
Debug.WriteLine(query);
object result = null;
string errorString = null;
var stopWatch = new Stopwatch();
stopWatch.Start();
Assert.IsTrue(sf.Table.Query(query, ref result, ref errorString), "Table.Query: " + errorString);
stopWatch.Stop();
Debug.WriteLine("Time it took to query filebased shapefile: " + stopWatch.Elapsed);
Assert.IsNotNull(result);
var indexes = result as int[];
Assert.IsNotNull(indexes);
Assert.IsTrue(indexes.Length > 0, "No results found");
// Make memory copy:
sf.SelectAll();
var sfTemp = sf.ExportSelection();
sf.SelectNone();
stopWatch.Restart();
Assert.IsTrue(sfTemp.Table.Query(query, ref result, ref errorString), "Table.Query: " + errorString);
stopWatch.Stop();
Debug.WriteLine("Time it took to query memory shapefile: " + stopWatch.Elapsed);
var fileLocation = Path.Combine(Path.GetTempPath(), "speed.shp");
Helper.SaveAsShapefile(sfTemp, fileLocation);
// Again after saving
stopWatch.Restart();
Assert.IsTrue(sfTemp.Table.Query(query, ref result, ref errorString), "Table.Query: " + errorString);
stopWatch.Stop();
Debug.WriteLine("Time it took to query memory shapefile after saving: " + stopWatch.Elapsed);
sfTemp.Close();
// Use QTree
sf.UseQTree = true;
stopWatch.Restart();
Assert.IsTrue(sf.Table.Query(query, ref result, ref errorString), "Table.Query: " + errorString);
stopWatch.Stop();
Debug.WriteLine("Time it took to query shapefile with QTree index: " + stopWatch.Elapsed);
// Use spatial index:
Assert.IsTrue(sf.CreateSpatialIndex(sf.Filename), "Cannot create spatial index");
stopWatch.Restart();
Assert.IsTrue(sf.Table.Query(query, ref result, ref errorString), "Table.Query: " + errorString);
stopWatch.Stop();
Debug.WriteLine("Time it took to query shapefile with spatial index: " + stopWatch.Elapsed);
// Open temp shapefile:
var sfTemp2 = Helper.OpenShapefile(fileLocation);
sfTemp2.UseQTree = true;
stopWatch.Restart();
Assert.IsTrue(sfTemp2.Table.Query(query, ref result, ref errorString), "Table.Query: " + errorString);
stopWatch.Stop();
Debug.WriteLine("Time it took to query temp shapefile with QTree index after loading from disk: " + stopWatch.Elapsed);
Time it took to query filebased shapefile: 00:00:01.2651178
Time it took to query memory shapefile: 00:00:00.2310401
Time it took to query memory shapefile after saving: 00:00:00.6871266
Time it took to query shapefile with QTree index: 00:00:00.7006017
Time it took to query shapefile with spatial index: 00:00:00.7213261
Time it took to query temp shapefile with QTree index after loading from disk: 00:00:01.2230481
I updated the test a bit and noticed that the first query against a file-based shapefile is always slower than the second, so some caching must be involved.
The in-memory shapefile stays much faster and shows no difference between the first and the second query, presumably because it is already fully optimized:
Time it took to query filebased shapefile: 00:00:01.1245868
Time it took to query filebased shapefile second time: 00:00:00.6824004
Time it took to ExportSelection: 00:00:01.2607499
Time it took to query memory shapefile: 00:00:00.2366985
Time it took to query memory shapefile second time: 00:00:00.2387093
Time it took to query memory shapefile after saving: 00:00:00.6258544
Time it took to query memory shapefile after saving second time: 00:00:00.2375523
Time it took to query shapefile with QTree index: 00:00:00.6741876
Time it took to query shapefile with QTree index second time: 00:00:00.6730707
Time it took to query shapefile with spatial index: 00:00:00.6897191
Time it took to query shapefile with spatial index second time: 00:00:00.6699897
Time it took to open shapefile: 00:00:00.0165911
Time it took to query temp shapefile with QTree index after loading from disk: 00:00:01.1004431
Time it took to query temp shapefile with QTree index after loading from disk second time: 00:00:00.6669803
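To make the first-run versus second-run comparison repeatable, the Stopwatch pattern from the unit test could be wrapped in a small helper. This is only a sketch: `QueryTiming` and `TimeRuns` are hypothetical names, not part of MapWinGIS or the test project.

```csharp
using System;
using System.Diagnostics;

public static class QueryTiming
{
    // Runs the action `runs` times and returns the elapsed time of each run, in order.
    // Index 0 is the cold (first) run; later entries are the warm runs.
    public static TimeSpan[] TimeRuns(Action action, int runs)
    {
        if (action == null) throw new ArgumentNullException(nameof(action));
        if (runs < 1) throw new ArgumentOutOfRangeException(nameof(runs));

        var timings = new TimeSpan[runs];
        for (var i = 0; i < runs; i++)
        {
            var stopWatch = Stopwatch.StartNew();
            action();
            stopWatch.Stop();
            timings[i] = stopWatch.Elapsed;
        }
        return timings;
    }
}
```

In the test above it might be called as `QueryTiming.TimeRuns(() => sf.Table.Query(query, ref result, ref errorString), 2)`, which would give the first and second timings for the same shapefile in one call.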
A file-based spatial index or a QTree index has no effect, which makes sense because both are spatial indexes and do not index attributes.
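For illustration, this is roughly what an attribute index would look like if it were built on the application side. It is a sketch and not something MapWinGIS provides: `AttributeIndex`, `Build` and `Lookup` are hypothetical names, the values are assumed to have been read once from the [NAME] field (for instance through the shapefile's CellValue property), and the case-insensitive matching is also an assumption.

```csharp
using System;
using System.Collections.Generic;

public static class AttributeIndex
{
    // Maps each distinct attribute value (compared case-insensitively, an assumption)
    // to the list of shape indexes that carry that value.
    public static Dictionary<string, List<int>> Build(IReadOnlyList<string> values)
    {
        var index = new Dictionary<string, List<int>>(StringComparer.OrdinalIgnoreCase);
        for (var i = 0; i < values.Count; i++)
        {
            var key = values[i] ?? string.Empty;
            if (!index.TryGetValue(key, out var bucket))
            {
                bucket = new List<int>();
                index[key] = bucket;
            }
            bucket.Add(i);
        }
        return index;
    }

    // Returns the shape indexes for a value, or an empty list when the value is unknown.
    public static IReadOnlyList<int> Lookup(Dictionary<string, List<int>> index, string value)
    {
        return index.TryGetValue(value ?? string.Empty, out var bucket)
            ? bucket
            : (IReadOnlyList<int>)Array.Empty<int>();
    }
}
```

With such an index, a geocoder would only need to check the address ranges and ZIP codes of the few segments returned for a street name, instead of scanning the whole table for each of the 137 addresses.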
I tried debugging MapWinGIS to see what the difference is between the file-based shapefile and the in-memory version, but couldn’t find anything, most likely because I’m not a good C++ debugger.