I have this dataframe in R. It has the structure of a pedigree dataframe, with the id, fid, mid and sex columns.
pedigree <- structure(list(id = c(212, 214, 263, 266, 273, 274, 275, 279,
280, 281, 286, 287, 312, 313, 314, 315, 316, 317, 318, 319, 320,
321, 322, 323, 324, 325, 326, 327, 332, 333, 334, 335, 336, 337,
338, 339, 340, 341, 346, 347, 348, 349, 389, 390, 391, 392, 413,
414, 415, 416, 466, 475, 476, 477, 478, 479, 480, 483, 486, 487,
491, 492, 493, 494, 498, 501, 502, 506, 507, 508, 509, 510, 511,
512, 513, 514, 518, 519, 542, 543, 544, 545, 546, 547, 551, 552,
553, 554, 555, 556, 564, 565, 568, 569, 570, 575, 576, 579, 580,
584, 585, 586, 589, 590, 593, 595, 596, 597, 598, 599, 614, 615,
616, 617, 618, 619, 620, 621, 622, 623, 624, 625, 653, 654, 662,
663, 671, 672, 673, 674, 675, 676, 681, 682, 683, 684, 688, 689,
693, 694, 695, 696, 697, 698, 701, 702, 703, 704, 709, 710, 715,
716, 718, 720, 721, 722, 723, 724, 725, 726, 727, 730, 731, 736,
737, 738, 739, 740, 744, 745, 842, 843, 874, 875, 884, 885, 886,
887, 889, 890, 894, 895, 896, 897, 898, 903, 905, 906, 907, 908,
909, 910, 911, 912, 913, 914, 915, 917, 925, 926, 927, 928, 929,
931, 932, 936, 965, 999, 1000, 1006, 1007, 1041, 1043, 1044,
1046, 1068, 1069, 1070, 1071, 1072, 1073, 1074, 1075, 1099, 1100,
1101, 1321, 1322, 1368, 1551, 1552, 1553, 1554, 1555), fid = c(0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 326, 326, 326, 326, 279, 320, 320, 320, 320, 320, 320,
320, 320, 320, 324, 324, 324, 324, 322, 322, 322, 324, 324, 324,
324, 324, 324, 324, 324, 324, 318, 318, 326, 326, 326, 326, 326,
326, 326, 326, 326, 326, 326, 326, 332, 332, 287, 287, 287, 287,
287, 286, 286, 346, 346, 346, 348, 348, 348, 326, 326, 326, 326,
326, 332, 332, 320, 320, 320, 320, 320, 287, 346, 346, 346, 346,
273, 273, 273, 273, 266, 334, 334, 334, 334, 334, 336, 336, 336,
336, 336, 336, 334, 334, 334, 334, 334, 334, 338, 338, 338, 338,
340, 340, 340, 338, 338, 334, 334, 334, 334, 334, 334, 334, 334,
314, 314, 314, 314, 314, 314, 314, 312, 312, 0, 0, 286, 286,
314, 314, 314, 314, 314, 314, 334, 334, 334, 334, 334, 389, 389,
389, 389, 389, 389, 389, 389, 389, 389, 389, 389, 338, 332, 332,
332, 332, 332, 332, 332, 346, 274, 391, 391, 391, 391, 0, 0,
0, 0, 316, 316, 316, 316, 316, 316, 316, 316, 842, 842, 842,
1041, 1041, 1041, 1043, 1043, 1043, 1043, 1043), mid = c(0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 327, 327, 327, 327, 275, 321, 321, 321, 321, 321, 321,
321, 321, 321, 325, 325, 325, 325, 323, 323, 323, 325, 325, 325,
325, 325, 325, 325, 325, 325, 319, 319, 327, 327, 327, 327, 327,
327, 327, 327, 327, 327, 327, 327, 333, 333, 212, 212, 212, 212,
212, 214, 214, 347, 347, 347, 349, 349, 349, 327, 327, 327, 327,
327, 333, 333, 321, 321, 321, 321, 321, 212, 347, 347, 347, 347,
281, 281, 281, 281, 263, 335, 335, 335, 335, 335, 337, 337, 337,
337, 337, 337, 335, 335, 335, 335, 335, 335, 339, 339, 339, 339,
341, 341, 341, 339, 339, 335, 335, 335, 335, 335, 335, 335, 335,
315, 315, 315, 315, 315, 315, 315, 313, 313, 0, 0, 214, 214,
315, 315, 315, 315, 315, 315, 335, 335, 335, 335, 335, 390, 390,
390, 390, 390, 390, 390, 390, 390, 390, 390, 390, 339, 333, 333,
333, 333, 333, 333, 333, 347, 280, 392, 392, 392, 392, 0, 0,
0, 0, 317, 317, 317, 317, 317, 317, 317, 317, 843, 843, 843,
1044, 1044, 1044, 1046, 1046, 1046, 1046, 1046), sex = structure(c(1L,
1L, 1L, 2L, 2L, 2L, 1L, 2L, 1L, 1L, 2L, 2L, 2L, 1L, 2L, 1L, 2L,
1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L,
1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 1L, 2L, 1L,
2L, 2L, 1L, 1L, 1L, 1L, 2L, 2L, 1L, 1L, 2L, 1L, 1L, 1L, 1L, 1L,
2L, 2L, 2L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 1L, 2L, 2L,
2L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 1L, 1L, 2L, 1L, 2L,
2L, 2L, 2L, 2L, 2L, 1L, 2L, 1L, 1L, 1L, 2L, 2L, 2L, 1L, 1L, 1L,
1L, 1L, 2L, 2L, 2L, 1L, 1L, 2L, 2L, 1L, 1L, 2L, 2L, 2L, 1L, 2L,
2L, 2L, 2L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 1L, 2L, 1L, 1L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L,
1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 1L, 2L, 2L, 2L,
1L, 1L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 1L, 1L,
2L, 1L, 1L, 1L, 2L, 2L, 1L, 1L, 1L), levels = c("1", "2"), class = "factor")), row.names = c(NA,
-234L), class = c("tbl_df", "tbl", "data.frame"))
This is the structure, where there are 234 individuals:
str(pedigree)
Classes ‘tbl_df’, ‘tbl’ and 'data.frame': 234 obs. of 4 variables:
$ id : num 212 214 263 266 273 274 275 279 280 281 ...
$ fid: num 0 0 0 0 0 0 0 0 0 0 ...
$ mid: num 0 0 0 0 0 0 0 0 0 0 ...
$ sex: Factor w/ 2 levels "1","2": 1 1 1 2 2 2 1 2 1 1 ...
I am trying to do a pedigree analysis with pedtools.
To convert this dataframe into a ped object, I call as.ped(pedigree).
However, the call fails with a malformed pedigree error:
as.ped(pedigree)
Error: Malformed pedigree.
Individual 287 is female, but appear as the father of 568
Individual 212 is male, but appear as the mother of 568
I checked the ids 568, 287 and 212, and everything looks properly assigned: 287 is the mother of 568 (it is included in fid), and 212 is the father of 568 (it is included in mid).
As a convention, 1 refers to males and 2 to females.
What might be happening?
Looking at your dataset, the record for 568 has fid = 287 and mid = 212.
So 287 is in the fid (father) column, not the mid (mother) column as you state, and 212 is in mid, not fid. There is an error somewhere in the data: either fid and mid have been switched for this record, or the sex values of 287 and 212 have been swapped.
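One way to see the mismatch directly is to pull the child's row and look up the recorded sex of each listed parent. This is a minimal base R sketch on a three-row toy subset, not the full data: the parent rows follow the values in the error message, while the sex given to 568 is a placeholder chosen only for illustration.

```r
# Toy subset with the question's columns (id, fid, mid, sex).
# Assumption: the sex of 568 is a placeholder; the parents' values follow the error message.
ped <- data.frame(
  id  = c(212, 287, 568),
  fid = c(0, 0, 287),
  mid = c(0, 0, 212),
  sex = factor(c(1, 2, 1), levels = c(1, 2))
)

# The record in question
child <- ped[ped$id == 568, ]
print(child)

# Look up the recorded sex of the listed father (fid) and mother (mid)
father_sex <- as.character(ped$sex[match(child$fid, ped$id)])
mother_sex <- as.character(ped$sex[match(child$mid, ped$id)])

cat("father", child$fid, "has sex", father_sex, "\n")
cat("mother", child$mid, "has sex", mother_sex, "\n")
```

With 1 = male and 2 = female, a listed father coded 2 or a listed mother coded 1 is the kind of conflict that as.ped() reports.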
Edit: On further inspection, several records list 287 as the father and 212 as the mother, specifically ids 568, 569, 570, 575, 576 and 621.
This suggests that the sex values for 287 and 212 are incorrect (rather than fid and mid being swapped across several records), but you will need to examine your data source (or processing pipeline) to confirm.
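To check the whole table rather than one record at a time, you can flag every child whose listed father is not coded 1 or whose listed mother is not coded 2. The sketch below uses base R on made-up toy data; `find_sex_conflicts` is a hypothetical helper name, not a pedtools function, and the final recoding step assumes you have already confirmed that the sex values (and not the fid/mid columns) are wrong.

```r
# Toy pedigree (made-up ids): 1/2 are a consistent couple,
# 4 (coded female) is listed as father and 3 (coded male) as mother.
ped <- data.frame(
  id  = c(1, 2, 3, 4, 5, 6, 7),
  fid = c(0, 0, 0, 0, 1, 4, 4),
  mid = c(0, 0, 0, 0, 2, 3, 3),
  sex = factor(c(1, 2, 1, 2, 1, 2, 1), levels = c(1, 2))
)

# Hypothetical helper: list children whose parents' sex conflicts with their role.
find_sex_conflicts <- function(df) {
  sex_of <- function(ids) as.character(df$sex[match(ids, df$id)])
  list(
    # which() drops NA comparisons (founders with fid/mid = 0, or parents missing from id)
    father = df[which(df$fid != 0 & sex_of(df$fid) != "1"), c("id", "fid")],
    mother = df[which(df$mid != 0 & sex_of(df$mid) != "2"), c("id", "mid")]
  )
}

conflicts <- find_sex_conflicts(ped)
print(conflicts)

# Only if the data source confirms the sex values are wrong: recode the parents.
fixed <- ped
fixed$sex[fixed$id %in% unique(conflicts$father$fid)] <- "1"
fixed$sex[fixed$id %in% unique(conflicts$mother$mid)] <- "2"

# Re-run the check on the corrected copy
remaining <- find_sex_conflicts(fixed)
print(remaining)
```

If the source instead shows that fid and mid were switched, swap those two columns for the flagged rows and leave sex untouched. Either way, re-run the check before calling as.ped() again.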