I have a vector that looks like this:
y =
Columns 1 through 19:
1 1 1 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2
Columns 20 through 38:
2 2 2 2 3 3 3 3 3 3 3 3 3 3 3 4 4 4 4
Columns 39 through 57:
4 4 4 4 4 4 4 5 5 5 5 5 5 5 5 5 5 5 6
Columns 58 through 67:
6 6 6 6 6 6 6 6 6 6
The vector y always starts at 1 and counts up. There are many repeated numbers because they are the class labels of the samples.
Here we have 1 1 1 1 1 1 1 1 1 1 1 1
= 12 samples for class number 1.
We have 2 2 2 2 2 2 2 2 2 2 2
= 11 samples for class number 2.
My problem is that I want to find the start and stop index of every class. For example, class 1 always begins at index 0 and ends, in this case, at index 11.
Class 2 begins directly after class 1 ends.
Question:
I'm using EJML (Efficient Java Matrix Library) and I'm planning to use this function:
C = A.extractMatrix(1,4,2,8)
Which is equal to this MATLAB code:
C = A(2:4,3:8)
But I need to find the start and stop indexes from this y vector. At which index does, for example, class 3 start and stop? Do you have any smart ideas for how to do that?
Sure, I could use a for-loop, but that will be quite slow in Java because I'm going to have a very large y vector.
Suggestions?
Edit:
Here is a suggestion. Is it good, or could it be done better?
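// Finds the start and stop index of class c, continuing from the previous class.
// On entry, startStop[1] must hold the previous class's stop index (-1 before the first call).
private void startStopIndex(SimpleMatrix y, int c, Integer[] startStop) {
    int column = y.numCols();
    startStop[0] = startStop[1] + 1; // Begin at the next class
    for(int i = startStop[0]; i < column; i++) {
        if(y.get(i) != c) {
            break; // First sample of the next class, so stop here
        }else {
            startStop[1] = i; // Still inside class c
        }
    }
}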
Assuming that we are calling the method from:
Integer[] startStop = {0, -1}; // stop = -1 so that the first class starts at index 0
for(int i = 1; i <= c; i++) { // c is the number of classes; labels start at 1
    startStopIndex(y, i, startStop);
}
If you want to do it faster, binary search is your friend. I threw this together really quickly, and it does things in O(log n) time, whereas a linear search takes O(n). It's pretty basic and assumes your data looks pretty much like you describe it. Feed it weird data and it will break:
You'd call it like this:
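As a minimal sketch of the binary-search idea and of how it might be called, assuming the labels are copied into a plain `double[]` that is sorted in non-decreasing order (as in the question). The class name `ClassRangeSearch` and the methods `lowerBound` and `findClassRange` are hypothetical names chosen for illustration, not part of EJML:

```java
public class ClassRangeSearch {

    // Returns the index of the first element >= target in a sorted
    // (non-decreasing) array, or y.length if there is no such element.
    public static int lowerBound(double[] y, double target) {
        int lo = 0;
        int hi = y.length;
        while (lo < hi) {
            int mid = (lo + hi) >>> 1; // midpoint without int overflow
            if (y[mid] < target) {
                lo = mid + 1;          // answer lies to the right of mid
            } else {
                hi = mid;              // mid itself may be the answer
            }
        }
        return lo;
    }

    // Returns {start, stop} (both inclusive) for class c,
    // or null if class c does not occur in y.
    // Assumption: class labels are whole numbers, so c + 1 is the next label.
    public static int[] findClassRange(double[] y, int c) {
        int start = lowerBound(y, c);
        int end = lowerBound(y, c + 1); // exclusive end of class c
        if (start == end) {
            return null;
        }
        return new int[] {start, end - 1};
    }

    public static void main(String[] args) {
        double[] y = {1, 1, 1, 2, 2, 3, 3, 3, 3};
        for (int c = 1; c <= 3; c++) {
            int[] range = findClassRange(y, c);
            System.out.println("class " + c + ": " + range[0] + ".." + range[1]);
        }
    }
}
```

Each lookup does two binary searches, so it stays O(log n) per class. The stop index returned here is inclusive, so it would need 1 added before being used as an upper bound for `extractMatrix`, whose upper bounds are exclusive as the MATLAB comparison in the question shows.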