This essay has been submitted by a student. This is not an example of the work written by our professional essay writers.
Abstract: - In location based services (LBS), it is familiar that some query locations are uncertain. This type of uncertainty is generally known as location uncertainty and these queries are called uncertain or imprecise spatial queries. In this paper, we present a GUI based tool for computation of range aggregates over spatial queries whose locations are not precise. These computations involve calculation of distributive aggregate functions such as MIN and MAX for distances between the imprecise query and different spatial data points. We use the aggregate COUNT to count the number of spatial data points within the given query region. To accomplish these tasks, we construct the filtering and verification framework and two efficient filtering techniques MMD and APF. These filtering methods are mainly used to reduce the candidate size. We experiment the tool on various spatial data sets to show the effectiveness of the filters.
Index Terms: - imprecise spatial query, aggregate functions, filtering methods
Uncertainty or imprecision is the key concern in many applications such as sensor databases , environmental databases and location based services. The uncertainty is mainly due to the nature of incomplete data or dimension errors or behavior of moving objects. For example, in location based services (LBS) the query location is imprecise due to the nature of its moving objects. In uncertain applications, every entity follows a probability density function (pdf) . The pdf describes the probability that the entity appears at each position in a multidimensional workspace. Generally we use the normal distribution to evaluate the probability density function.
In recent past, much research on analyzing and processing of imprecise data has been carried out. One of the approaches to efficiently analyze this uncertain data is range query. In our paper, we resolve such problems by designing a GUI based tool for computing range aggregates  over imprecise spatial range queries by using effective filtering techniques. Basically, this problem can be solved by computing falling probability for each data point in the workspace and then apply aggregates on those data points. But the computation of falling probability involves integration function, which is very expensive. To avoid this problem, we demonstrate a filtering and verification framework  and two effective filter methods MMD and APF . The implementation of filters requires computation of simple Euclidean distances from query point/anchor point to the different data points and the comparison of these distances with the query distance. Only the data points that can not be filtered will be verified by computing the falling probabilities.
2. BACKGROUND KNOWLEDGE
Imprecise or uncertain data is the data which varies according to time. Numerous applications make use of many uncertain entities like wind speed, pressure, temperature etc. Sometimes, it is not feasible to store these types of data due to their nature of uncertainty. In those situations, there is also a possibility of message delay or inaccurate or incomplete results by the databases. For example, in sensor databases  the uncertainty is mainly due to the resource limitations such as low battery power or network bandwidth. This leads to the delay of reaching sensor data to the system.
The major problem in location based services is imprecise or uncertain queries . In location based services some users issue different queries based on their present location. Consider an example, "calculate the total number of vehicles within ten meters of my present position" is a location based query, issued based on the user's current location. Generally, the locations of the users are imprecise due to measurement error or message delay or resource limitations. Some recent works have declared that location privacy is also one of the reasons for location based uncertainty.
It is sometimes quite unfeasible for the query processing server to get an accurate location value. Thus it is necessary to have a location based uncertainty model to describe the imprecision. One such model is location based range query. In range queries concept, the query issuer issues a query with some range value. The range query retrieves the data points lying inside a multidimensional region based upon the range. We can calculate the number of data points within the region of a range query.
The basic way to avoid the uncertainty in location based services is computing falling probabilities for all the data points in the workspace. But this computation of falling probability for all the data points in the workspace is very expensive because it involves calculation of integration function. Thus, we use effective filters to efficiently prune the data points from computing complex integrations (i.e. verifications). We can accomplish this by using filtering and verification framework.
3.1. Filtering and Verification Framework
In Filtering and Verification framework , initially we construct an R-Tree  for the set of data points. An entry e of the R-Tree might be a data entry or an intermediate entry. The data entry represents a single point in the data set and an intermediate entry is a set of data entries. Initially, insert the root of R-Tree into an empty Queue.
The filtering and verification algorithm consists of two phases. In filtering phase, for each entry e of R-Tree, we apply the filter to efficiently prune the entry. If e is pruned by filter then the counter will be incremented by the number of data points in e. If e is not pruned by the filter, then check the entry e is a data entry or an intermediate entry. If e is a data entry then put e into V, or else put all the child entries of e into the Queue. The same process will be repeated until the Queue is empty. In verification phase, all the data points in V will be verified by computing their falling probabilities. The formula for falling probability is as follows.
Here Q is an uncertain query, p is the data point and Î³ is the query distance.
3.2. MMD Filter
The MMD (minimal/maximal distance) filter  consists of minimum & maximum distances between the uncertain query point and different data points. A data point is said to be pruned if it satisfies any of the given conditions.
Î´min(Qmbb, embb) > Î³
Î´max(Qmbb, embb) <= Î³
Here Î´max and Î´min represent the maximum and minimum Euclidean distances between the uncertain query (Qmbb) and set of data points (embb). The Î³ represents the query distance.
3.3. Anchor Point Filter (APF)
Input: Data entry e, Query distance Î³, Anchor point 'a' and Prob. threshold Î¸.
Output: pruned or unknown
Compute Î´max (a, embb);
Compute Î´min (a, embb);
If Î³ - Î´max (a, embb)>= UD(Î¸) then
If Î´min (a, embb)+Î³+Ñ” > UD(1-Î¸) then
End if End if
If Î´min (a, embb)-Î³-Ñ”<= LD(Î¸) then
If Î´max (a, embb) < LD(Î¸) - Î³ then
UPfall (embb, Î³) < Î¸ then
The Anchor Point Filter (APF)  is an efficient filtering technique which reduces the candidate size. APF is employed based upon the pre computed anchor points. The anchor points are the pivot values whose falling probabilities for different query distance (Î³) values are pre computed. We can prune or validate a data point based on its distance to the anchor point. If the number of anchor points is more then the filtering capacity of APF will also be more.
Here UD(Î¸) and LD(Î¸) represents the minimum and maximum distances from the query point to different anchor points respectively and Ñ” is a small constant value. In APF algorithm, initially we compute the maximum and minimum Euclidean distances between the anchor point and different data points. Now we compare this min and max distances with the UD(Î¸) and LD(Î¸) to efficiently prune the data points.
4. EXPERIMENTAL EVALUATION
We conduct the experiments on both synthetic and real spatial data sets to evaluate the effectiveness of the filters. In this project we consider a real spatial data set, 'United States' and a sample synthetic spatial data set. The experiments prove that the Anchor Point Filter (APF) is more effective than the MMD filter with less candidate size and more filtering capacity. The experimental platform is 2.3GHz core i5 processor with 4 GB RAM and Windows 7 operating system.
Figure 4.1: Candidate Size Graph for the real spatial data set United States.
Figure 4.2: Candidate Size Graph for the synthetic spatial data set Sample.
The graphs 4.1 and 4.2 show that the APF has lower candidate set size when compared to MMD. Here the candidate set size represents the number of data points that need verification.
Thus we can say that the Anchor Point Filter is the more effective and efficient filter than the MMD filter.
5. CONCLUSION and FUTURE WORK
In this paper, first we demonstrate the problem of imprecise queries and then the effect of imprecise queries on spatial databases. We resolve this problem by designing a GUI based tool for computing range aggregates using effective filtering techniques. To accomplish this, we propose a filtering and verification frame work along with two effective filters MMD and APF. The experiment evaluation shows the effectiveness of the two filters. The future scope of the project is aimed to develop the efficient and effective filters to decrease the candidate set size.