What is the origin of junk DNA?


Most eukaryotes carry some amount of junk DNA in the nuclei of their cells. What is the origin (or what are the origins) of this junk DNA, and is it really junk (superfluous)?


"Junk DNA" is more appropriately called non-coding DNA. It is defined as any DNA region that does not code for a gene or, more precisely, does not lie within an open reading frame. Over 98% of the human genome consists of non-coding DNA. However, the more we learn about molecular biology, the better we understand the biological function and importance of non-coding DNA. Examples of important functions are:

  1. Regulatory regions that control gene expression
  2. Regions that encode regulatory RNA
  3. Regions where epigenetic regulation takes place
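
The open-reading-frame definition above can be made concrete with a small sketch. It assumes a simplified model (forward strand only, ATG as the only start codon, the three standard stop codons), and the function names and example sequence are hypothetical, chosen only for illustration.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}

def find_orfs(dna, min_codons=3):
    """Return (start, end) spans of ATG...stop stretches on the forward strand."""
    dna = dna.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(dna) - 2, 3):
            codon = dna[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # open a candidate reading frame
            elif start is not None and codon in STOP_CODONS:
                if (i + 3 - start) // 3 >= min_codons:
                    orfs.append((start, i + 3))
                start = None  # close the frame at the stop codon
    return sorted(orfs)

def noncoding_fraction(dna, orfs):
    """Fraction of bases that fall outside every reported reading frame."""
    covered = set()
    for start, end in orfs:
        covered.update(range(start, end))
    return 1 - len(covered) / len(dna)

example = "CCCATGAAATTTTAGGGGCC"  # toy sequence, not real genomic data
orfs = find_orfs(example)
print(orfs, noncoding_fraction(example, orfs))
```

Real gene finders also handle the reverse strand, introns and alternative start codons, so this only illustrates the definition, not a practical annotation method.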

However, there are also regions that probably have no useful biological function and can justifiably be called junk:

  1. Transposons are genetic regions that can copy themselves (either through enzymatically active RNA or by coding for a transposase protein). They are believed to have evolved as "selfish genes", and several defence mechanisms against rogue transposons are known (siRNA, RNAi). Transposons and these defence mechanisms have since become powerful tools in molecular biology research.
  2. Endogenous retroviral sequences, which are remnants of retroviruses that inserted themselves into the germline and were inactivated by mutation.

However, even these "junk" regions are believed to have important evolutionary functions, such as protection against mutation by retroviruses: because there are large regions of DNA where the exact sequence and function do not matter, a retrovirus that inserts at random positions in the genome is less likely to cause lasting damage.


In short, we know of many mechanisms by which genomes can grow. Tetrapods have had at least two complete genome duplications in their history; transposons spread; retroviruses insert; partial duplications give rise to pseudogenes. And these expansion mechanisms can be fast: a complete genome duplication doubles genome size in a single generation.

But we know of very few mechanisms by which genomes can shrink. Most of them are very slow, and very few of them are targeted.

From a mechanistic standpoint, it is very hard to imagine a targeted way of removing useless but harmless DNA quickly and with 100% accuracy. If the accuracy is not 100%, the pathway would be more harmful than the DNA it is trying to remove.

The key point is that if the extra DNA is harmless, or nearly harmless, there is no reason to eliminate it, and there are reasons (errors in removal) not to try.

So the short and simple answer is that genomes can accumulate useless DNA much more easily than they can get rid of it. That is just common sense, and it fits with 30 years of experiments.


'Junk DNA' reveals the nature of our ancient ancestors

The key to solving one of the great puzzles in evolutionary biology, the origin of vertebrates (animals with an internal skeleton made of bone), has been revealed in new research from Dartmouth College and the University of Bristol.

Vertebrates are the most anatomically and genetically complex of all organisms, but explaining how they achieved this complexity has vexed scientists. The study, published today [20 October] in the Proceedings of the National Academy of Sciences, claims to have solved this scientific riddle by analysing the genomics of primitive living fishes such as sharks and lampreys, and of their spineless relatives such as sea squirts.

Alysha Heimberg of Dartmouth College and colleagues studied the family relationships of primitive vertebrates. The team used microRNAs, a class of tiny molecules only recently discovered residing within what has usually been considered "junk DNA", to show that lampreys and slime eels (hagfish) are distant relatives of jawed vertebrates.

Alysha said: "We learn from our results that lamprey and hagfish are equally related to jawed vertebrates and that hagfish are not representative of a more primitive vertebrate, which suggests that the ancestral vertebrate was more complex than anyone had previously thought.

"Vertebrates have been evolving for hundreds of millions of years, but they still express the same microRNA genes in the same organs as when both first appeared."

The team went on to test the idea that these same microRNA genes were responsible for the evolutionary origin of vertebrate anatomical features. They found that the same suite of microRNAs was expressed in the same organs and tissues in lampreys and mice.

Co-author Professor Philip Donoghue of the University of Bristol's School of Earth Sciences said: "The origin of vertebrates and the origin of these genes is no coincidence."

Professor Kevin Peterson of Dartmouth College said: "This study not only points the way to understanding the evolutionary origin of our own lineage, but it also helps us to understand how our own genome was assembled in deep time."


References

  1. Pennisi E (September 2012). "Genomics. ENCODE project writes eulogy for junk DNA". Science. 337 (6099): 1159–1161. doi:10.1126/science.337.6099.1159. PMID 22955811.
  2. ENCODE Project Consortium (September 2012). "An integrated encyclopedia of DNA elements in the human genome". Nature. 489 (7414): 57–74. Bibcode:2012Natur.489...57T. doi:10.1038/nature11247. PMC 3439153. PMID 22955616.
  3. Carey M (2015). Junk DNA: A Journey Through the Dark Matter of the Genome. Columbia University Press. ISBN 9780231170840.
  4. McKie R (24 February 2013). "Scientists attacked over claim that 'junk DNA' is vital to life". The Observer.
  5. Eddy SR (November 2012). "The C-value paradox, junk DNA and ENCODE". Current Biology. 22 (21): R898–9. doi:10.1016/j.cub.2012.10.002. PMID 23137679. S2CID 28289437.
  6. Doolittle WF (April 2013). "Is junk DNA bunk? A critique of ENCODE". Proceedings of the National Academy of Sciences of the United States of America. 110 (14): 5294–300. Bibcode:2013PNAS..110.5294D. doi:10.1073/pnas.1221376110. PMC 3619371. PMID 23479647.
  7. Palazzo AF, Gregory TR (May 2014). "The case for junk DNA". PLOS Genetics. 10 (5): e1004351. doi:10.1371/journal.pgen.1004351. PMC 4014423. PMID 24809441.
  8. Graur D, Zheng Y, Price N, Azevedo RB, Zufall RA, Elhaik E (2013). "On the immortality of television sets: "function" in the human genome according to the evolution-free gospel of ENCODE". Genome Biology and Evolution. 5 (3): 578–90. doi:10.1093/gbe/evt028. PMC 3622293. PMID 23431001.
  9. Ponting CP, Hardison RC (November 2011). "What fraction of the human genome is functional?". Genome Research. 21 (11): 1769–76. doi:10.1101/gr.116814.110. PMC 3205562. PMID 21875934.
  10. Kellis M, Wold B, Snyder MP, Bernstein BE, Kundaje A, Marinov GK, et al. (April 2014). "Defining functional DNA elements in the human genome". Proceedings of the National Academy of Sciences of the United States of America. 111 (17): 6131–8. Bibcode:2014PNAS..111.6131K. doi:10.1073/pnas.1318948111. PMC 4035993. PMID 24753594.
  11. Rands CM, Meader S, Ponting CP, Lunter G (July 2014). "8.2% of the Human genome is constrained: variation in rates of turnover across functional element classes in the human lineage". PLOS Genetics. 10 (7): e1004525. doi:10.1371/journal.pgen.1004525. PMC 4109858. PMID 25057982.
  12. Mattick JS (2013). "The extent of functionality in the human genome". The HUGO Journal. 7 (1): 2. doi:10.1186/1877-6566-7-2. PMC 4685169.
  13. Morris K, ed. (2012). Non-coding RNAs and Epigenetic Regulation of Gene Expression: Drivers of Natural Selection. Norfolk, UK: Caister Academic Press. ISBN 978-1904455943.

The amount of total genomic DNA varies widely between organisms, and the proportion of coding and non-coding DNA within these genomes varies greatly as well. For example, it was originally suggested that over 98% of the human genome does not encode protein sequences, including most sequences within introns and most intergenic DNA, [2] while 20% of a typical prokaryote genome is non-coding. [3]

In eukaryotes, genome size, and by extension the amount of non-coding DNA, is not correlated with organismal complexity, an observation known as the C-value enigma. [4] For example, the genome of the unicellular Polychaos dubium (formerly known as Amoeba dubia) has been reported to contain more than 200 times the amount of DNA in humans. [5] The genome of the pufferfish Takifugu rubripes is only about one eighth the size of the human genome, yet it seems to have a comparable number of genes; approximately 90% of the Takifugu genome is non-coding DNA. [2] Therefore, most of the difference in genome size is due not to variation in the amount of coding DNA but to differences in the amount of non-coding DNA. [6]
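
The arithmetic behind that conclusion can be checked with the figures quoted in this section. The sketch below works in relative units (human genome = 1) and treats the rounded percentages as exact, which is an assumption; the variable names are illustrative only.

```python
# Relative genome sizes, with the human genome as the unit.
human_size = 1.0
fugu_size = human_size / 8          # "about one eighth the size"

# Coding fractions taken from the rounded figures in the text.
human_coding_fraction = 0.02        # "over 98%" non-coding, so roughly 2% coding
fugu_coding_fraction = 0.10         # "approximately 90%" non-coding

human_coding = human_size * human_coding_fraction
fugu_coding = fugu_size * fugu_coding_fraction
human_noncoding = human_size - human_coding
fugu_noncoding = fugu_size - fugu_coding

size_gap = human_size - fugu_size
noncoding_gap = human_noncoding - fugu_noncoding
noncoding_share_of_gap = noncoding_gap / size_gap

print(f"coding DNA: human {human_coding:.4f}, Takifugu {fugu_coding:.4f}")
print(f"share of the size gap that is non-coding: {noncoding_share_of_gap:.1%}")
```

With these inputs the two genomes carry coding DNA of the same order (0.02 versus 0.0125 human-genome units), while about 99% of the size difference is non-coding.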

In 2013, a new "record" for the most efficient eukaryotic genome was reported for Utricularia gibba, a bladderwort plant that has only 3% non-coding DNA and 97% coding DNA. The plant had deleted parts of its non-coding DNA, which suggested that non-coding DNA may not be as critical for plants, even though non-coding DNA is useful for humans. [1] Other studies on plants have discovered crucial functions in portions of non-coding DNA that were previously thought to be negligible, and have added a new layer to the understanding of gene regulation. [7]

Cis- and trans-regulatory elements

Cis-regulatory elements are sequences that control the transcription of a nearby gene. Many such elements are involved in the evolution and control of development. [8] Cis-elements may be located in 5' or 3' untranslated regions or within introns. Trans-regulatory elements control the transcription of a distant gene.

Promoters facilitate the transcription of a particular gene and are typically upstream of the coding region. Enhancer sequences may also exert very distant effects on the transcription levels of genes. [9]

Introns

Introns are non-coding sections of a gene, transcribed into the precursor mRNA sequence but ultimately removed by RNA splicing during processing to mature messenger RNA. Many introns appear to be mobile genetic elements. [10]

Studies of group I introns from Tetrahymena protozoans indicate that some introns appear to be selfish genetic elements, neutral to the host because they remove themselves from flanking exons during RNA processing and do not produce an expression bias between alleles with and without the intron. [10] Some introns appear to have significant biological function, possibly through ribozyme functionality that may regulate tRNA and rRNA activity as well as protein-coding gene expression. This is evident in hosts that have become dependent on such introns over long periods of time; for example, the trnL-intron is found in all green plants and appears to have been vertically inherited for several billion years, including more than a billion years within chloroplasts and an additional 2 to 3 billion years before that in the cyanobacterial ancestors of chloroplasts. [10]

Pseudogenes

Pseudogenes are DNA sequences, related to known genes, that have lost their protein-coding ability or are otherwise no longer expressed in the cell. Pseudogenes arise from retrotransposition or genomic duplication of functional genes, and become "genomic fossils" that are nonfunctional due to mutations that prevent the transcription of the gene, such as within the gene promoter region, or fatally alter the translation of the gene, such as premature stop codons or frameshifts. [11] Pseudogenes resulting from the retrotransposition of an RNA intermediate are known as processed pseudogenes; pseudogenes that arise from the genomic remains of duplicated genes or residues of inactivated genes are nonprocessed pseudogenes. [11] Transpositions of once functional mitochondrial genes from the cytoplasm to the nucleus, also known as NUMTs, also qualify as one type of common pseudogene. [12] NUMTs occur in many eukaryotic taxa.
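
The two translation-disrupting mutation types named above, premature stop codons and frameshifts, can be illustrated on a toy coding sequence. This is a minimal sketch: the sequence and helper names are hypothetical, and only the three standard stop codons are assumed.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}

def codons(seq):
    """Split a sequence into complete in-frame codons, dropping leftover bases."""
    usable = len(seq) - len(seq) % 3
    return [seq[i:i + 3] for i in range(0, usable, 3)]

def first_stop_index(seq):
    """Codon index of the first in-frame stop codon, or None if there is none."""
    for index, codon in enumerate(codons(seq)):
        if codon in STOP_CODONS:
            return index
    return None

functional = "ATGGCTCAGTTCGGATAA"             # ATG GCT CAG TTC GGA TAA
nonsense = "ATGGCTTAGTTCGGATAA"               # point mutation: CAG -> TAG
frameshift = functional[:4] + functional[5:]  # one base deleted from the second codon

print(first_stop_index(functional))  # stop at the end of the reading frame
print(first_stop_index(nonsense))    # stop appears early, truncating the product
print(codons(frameshift))            # every codon after the deletion is altered
```

In the frameshifted copy the original stop codon is no longer read in frame, so translation would run on into whatever sequence follows.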

While Dollo's Law suggests that the loss of function in pseudogenes is likely permanent, silenced genes may actually retain function for several million years and can be "reactivated" into protein-coding sequences, [13] and a substantial number of pseudogenes are actively transcribed. [11] [14] Because pseudogenes are presumed to change without evolutionary constraint, they can serve as a useful model of the type and frequencies of various spontaneous genetic mutations. [15]

Repeat sequences, transposons and viral elements

Transposons and retrotransposons are mobile genetic elements. Retrotransposon repeated sequences, which include long interspersed nuclear elements (LINEs) and short interspersed nuclear elements (SINEs), account for a large proportion of the genomic sequences in many species. Alu sequences, classified as a short interspersed nuclear element, are the most abundant mobile elements in the human genome. Some examples have been found of SINEs exerting transcriptional control of some protein-encoding genes. [16] [17] [18]

Endogenous retrovirus sequences are the product of reverse transcription of retrovirus genomes into the genomes of germ cells. Mutation within these retro-transcribed sequences can inactivate the viral genome. [19]

Over 8% of the human genome is made up of (mostly decayed) endogenous retrovirus sequences, as part of the over 42% fraction that is recognizably derived from retrotransposons, while another 3% can be identified as the remains of DNA transposons. Much of the remaining half of the genome that is currently without an explained origin is expected to have originated in transposable elements that were active so long ago (> 200 million years) that random mutations have rendered them unrecognizable. [20] Genome size variation in at least two kinds of plants is mostly the result of retrotransposon sequences. [21] [22]

Telomeres

Telomeres are regions of repetitive DNA at the end of a chromosome, which provide protection from chromosomal deterioration during DNA replication. Recent studies have shown that telomeres function to aid in their own stability. Telomeric repeat-containing RNAs (TERRA) are transcripts derived from telomeres. TERRA has been shown to maintain telomerase activity and lengthen the ends of chromosomes. [23]

The term "junk DNA" became popular in the 1960s. [24] [25] According to T. Ryan Gregory, the nature of junk DNA was first discussed explicitly in 1972 by a genomic biologist, David Comings, who applied the term to all non-coding DNA. [26] The term was formalized that same year by Susumu Ohno, [6] who noted that the mutational load from deleterious mutations places an upper limit on the number of functional loci that can be expected given a typical mutation rate. Ohno hypothesized that mammal genomes could not have more than 30,000 loci under selection before the "cost" from the mutational load would cause an inescapable decline in fitness, and eventually extinction. This prediction remains robust, with the human genome containing approximately 20,000 (protein-coding) genes. Another source for Ohno's theory was the observation that even closely related species can have widely (orders-of-magnitude) different genome sizes, which had been dubbed the C-value paradox in 1971. [27]
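
As a rough illustration of why mutational load caps the number of loci under selection, the sketch below uses the classical mutation-selection balance approximation, in which mean fitness relative to a mutation-free genotype is about exp(-U), where U is the number of new deleterious mutations per diploid genome per generation. This is not Ohno's own calculation: the per-locus rate of 1e-5 and the function names are assumptions chosen for illustration.

```python
import math

def deleterious_rate(n_loci, per_locus_rate):
    """U: expected new deleterious mutations per diploid genome per generation."""
    return 2 * n_loci * per_locus_rate

def mean_relative_fitness(n_loci, per_locus_rate):
    """Classical approximation at mutation-selection balance: w = exp(-U)."""
    return math.exp(-deleterious_rate(n_loci, per_locus_rate))

ASSUMED_RATE = 1e-5  # hypothetical deleterious mutation rate per locus per generation

for n_loci in (30_000, 300_000, 3_000_000):
    u = deleterious_rate(n_loci, ASSUMED_RATE)
    w = mean_relative_fitness(n_loci, ASSUMED_RATE)
    print(f"{n_loci:>9} loci: U = {u:.1f}, mean relative fitness = {w:.3g}")
```

Under these assumed numbers, fitness falls off exponentially as the count of selected loci grows, which is the qualitative point of the load argument.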

The term "junk DNA" has been questioned on the grounds that it provokes a strong a priori assumption of total non-functionality, and some have recommended using more neutral terminology such as "non-coding DNA" instead. [26] Yet "junk DNA" remains a label for the portions of a genome sequence for which no discernible function has been identified and that, through comparative genomics analysis, appear to be under no functional constraint, suggesting that the sequence itself has provided no adaptive advantage.

Since the late 1970s it has become apparent that the majority of non-coding DNA in large genomes originates in the selfish amplification of transposable elements, of which W. Ford Doolittle and Carmen Sapienza wrote in the journal Nature in 1980: "When a given DNA, or class of DNAs, of unproven phenotypic function can be shown to have evolved a strategy (such as transposition) which ensures its genomic survival, then no other explanation for its existence is necessary." [28] The amount of junk DNA can be expected to depend on the rate of amplification of these elements and the rate at which non-functional DNA is lost. [29] In the same issue of Nature, Leslie Orgel and Francis Crick wrote that junk DNA has "little specificity and conveys little or no selective advantage to the organism". [30] The term occurs mainly in popular science and in a colloquial way in scientific publications, and it has been suggested that its connotations may have delayed interest in the biological functions of non-coding DNA. [31]

Some evidence indicates that some "junk DNA" sequences are sources for (future) functional activity in evolution through exaptation of originally selfish or non-functional DNA. [32]

ENCODE Project

In 2012, the ENCODE project, a research program supported by the National Human Genome Research Institute, reported that 76% of the human genome's non-coding DNA sequences were transcribed and that nearly half of the genome was in some way accessible to genetic regulatory proteins such as transcription factors. [33] However, the suggestion by ENCODE that over 80% of the human genome is biochemically functional has been criticized by other scientists, [34] who argue that neither the accessibility of segments of the genome to transcription factors nor their transcription guarantees that those segments have biochemical function or that their transcription is selectively advantageous. After all, non-functional sections of the genome can be transcribed, given that transcription factors typically bind to short sequences that are found (randomly) all over the genome. [35]
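
The critics' point about short binding sequences occurring by chance can be put in numbers with a back-of-the-envelope estimate. The sketch below assumes equal base frequencies, independent positions, a search on both strands, and a genome length of 3.2 billion bases (the human figure quoted later in this document); the motif lengths and the function name are illustrative assumptions, and palindromic motifs are ignored.

```python
def expected_random_matches(genome_length, motif_length, strands=2):
    """Expected chance occurrences of one fixed motif in a random sequence."""
    positions = genome_length - motif_length + 1
    return strands * positions * 0.25 ** motif_length

GENOME_LENGTH = 3_200_000_000  # approximate human genome size in bases

for motif_length in (6, 8, 10, 12):
    hits = expected_random_matches(GENOME_LENGTH, motif_length)
    print(f"{motif_length}-base motif: about {hits:,.0f} chance matches")
```

Under these assumptions an 8-base motif is expected roughly 98,000 times by chance alone, so a binding site on its own is weak evidence of function.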

Furthermore, the much lower estimates of functionality prior to ENCODE were based on genomic conservation estimates across mammalian lineages. [27] [36] [37] [38] Widespread transcription and splicing in the human genome has been discussed as another indicator of genetic function in addition to genomic conservation, which may miss poorly conserved functional sequences. [39] Furthermore, much of the apparent junk DNA is involved in epigenetic regulation and appears to be necessary for the development of complex organisms. [40] [41] [42] Genetic approaches may miss functional elements that do not manifest physically on the organism; evolutionary approaches have difficulty using accurate multispecies sequence alignments, since genomes of even closely related species vary considerably; and with biochemical approaches, though they have high reproducibility, the biochemical signatures do not always automatically signify a function. [39] Kellis et al. noted that 70% of the transcription coverage was less than 1 transcript per cell (and may thus be based on spurious background transcription). On the other hand, they argued that a 12–15% fraction of human DNA may be under functional constraint, and that this may still be an underestimate when lineage-specific constraints are included. Ultimately, genetic, evolutionary, and biochemical approaches can all be used in a complementary way to identify regions that may be functional in human biology and disease. [39]

Some critics have argued that functionality can only be assessed in reference to an appropriate null hypothesis. In this case, the null hypothesis would be that these parts of the genome are non-functional and have properties, whether in terms of conservation or biochemical activity, that would be expected of such regions based on our general understanding of molecular evolution and biochemistry. According to these critics, until a region in question has been shown to have additional features beyond what is expected under the null hypothesis, it should provisionally be labelled as non-functional. [43]

Some non-coding DNA sequences must have some important biological function. This is indicated by comparative genomics studies that report highly conserved regions of non-coding DNA, sometimes on time-scales of hundreds of millions of years. This implies that these non-coding regions are under strong evolutionary pressure and positive selection. [44] For example, in the genomes of humans and mice, which diverged from a common ancestor 65–75 million years ago, protein-coding DNA sequences account for only about 20% of conserved DNA, with the remaining 80% of conserved DNA represented in non-coding regions. [45] Linkage mapping often identifies chromosomal regions associated with a disease with no evidence of functional coding variants of genes within the region, suggesting that disease-causing genetic variants lie in the non-coding DNA. [45] The significance of non-coding DNA mutations in cancer was explored in April 2013. [46]

Non-coding genetic polymorphisms play a role in susceptibility to infectious diseases such as hepatitis C. [47] Moreover, non-coding genetic polymorphisms contribute to susceptibility to Ewing sarcoma, an aggressive pediatric bone cancer. [48]

Some specific sequences of non-coding DNA may be features essential to chromosome structure, centromere function and recognition of homologous chromosomes during meiosis. [49]

According to a comparative study of over 300 prokaryotic and over 30 eukaryotic genomes, [50] eukaryotes appear to require a minimum amount of non-coding DNA. The amount can be predicted using a growth model for regulatory genetic networks, implying that it is required for regulatory purposes. In humans the predicted minimum is about 5% of the total genome.

Over 10% of 32 mammalian genomes may function through the formation of specific RNA secondary structures. [51] The study used comparative genomics to identify compensatory DNA mutations that maintain RNA base-pairings, a distinctive feature of RNA molecules. Over 80% of the genomic regions presenting evolutionary evidence of RNA structure conservation do not present strong DNA sequence conservation.

Non-coding DNA may perhaps serve to decrease the probability of gene disruption during chromosomal crossover. [52]

Evidence from polygenic scores and GWAS

Genome-wide association studies (GWAS) and machine learning analysis of large genomic datasets have led to the construction of polygenic predictors for human traits such as height, bone density, and many disease risks. Similar predictors exist for plant and animal species and are used in agricultural breeding. [54] The detailed genetic architecture of human predictors has been analyzed, and significant effects used in prediction are associated with DNA regions far outside coding regions. The fraction of variance accounted for (i.e., the fraction of predictive power captured by the predictor) in coding versus non-coding regions varies widely for different complex traits. For example, atrial fibrillation and coronary artery disease risk are mostly controlled by variants in non-coding regions (non-coding variance fraction over 70 percent), whereas diabetes and high cholesterol display the opposite pattern (non-coding variance roughly 20-30 percent). [53] Individual differences between humans are clearly affected in a significant way by non-coding genetic loci, which is strong evidence for functional effects. Whole exome genotypes (i.e., genotypes that contain information restricted to coding regions only) do not contain enough information to build or even evaluate polygenic predictors for many well-studied complex traits and disease risks.
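
To make the "fraction of variance in coding versus non-coding regions" idea concrete, here is a minimal sketch of an additive polygenic score. It assumes independent biallelic variants in Hardy-Weinberg proportions, so that each variant contributes 2p(1-p)β² to the score's variance; the variant names, effect sizes and frequencies are invented for illustration and do not come from any study.

```python
def polygenic_score(genotype, effects):
    """Weighted sum of effect-allele counts (0, 1 or 2) across variants."""
    return sum(beta * genotype.get(variant, 0) for variant, beta in effects.items())

def variance_contribution(effect, allele_freq):
    """Additive variance from one variant: 2 * p * (1 - p) * beta**2."""
    return 2 * allele_freq * (1 - allele_freq) * effect ** 2

def noncoding_variance_fraction(variants):
    """Share of the score's variance that comes from non-coding variants."""
    total = sum(variance_contribution(v["effect"], v["freq"]) for v in variants.values())
    noncoding = sum(
        variance_contribution(v["effect"], v["freq"])
        for v in variants.values()
        if not v["coding"]
    )
    return noncoding / total

# Hypothetical variants, not real GWAS results.
VARIANTS = {
    "variant_a": {"effect": 0.5, "freq": 0.5, "coding": False},
    "variant_b": {"effect": 1.0, "freq": 0.5, "coding": True},
    "variant_c": {"effect": 0.5, "freq": 0.5, "coding": False},
}

effects = {name: v["effect"] for name, v in VARIANTS.items()}
person = {"variant_a": 2, "variant_b": 1, "variant_c": 0}

print(polygenic_score(person, effects))
print(noncoding_variance_fraction(VARIANTS))
```

Dropping the non-coding variants from `effects`, as an exome-only dataset effectively would, removes their share of the predictor's variance, which is the limitation the paragraph above describes.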

In 2013, it was estimated that, in general, up to 85% of GWAS loci have non-coding variants as the likely causal association. The variants are often common in populations and were predicted to affect disease risk through small phenotypic effects, as opposed to the large effects of Mendelian variants. [55]

Some non-coding DNA sequences determine the expression levels of various genes, both those that are transcribed to proteins and those that are themselves involved in gene regulation. [56] [57] [58]

Transcription factors

Some non-coding DNA sequences determine where transcription factors attach. [56] A transcription factor is a protein that binds to specific non-coding DNA sequences, thereby controlling the flow (or transcription) of genetic information from DNA to mRNA. [59] [60]

Operators

An operator is a segment of DNA to which a repressor binds. A repressor is a DNA-binding protein that regulates the expression of one or more genes by binding to the operator and blocking the attachment of RNA polymerase to the promoter, thus preventing transcription of the genes. This blocking of expression is called repression. [61]

Enhancers

An enhancer is a short region of DNA that can be bound with proteins (trans-acting factors), much like a set of transcription factors, to enhance transcription levels of genes in a gene cluster. [62]

Silencers

A silencer is a region of DNA that inactivates gene expression when bound by a regulatory protein. It functions in a very similar way to enhancers, differing only in that it inactivates genes. [63]

Promoters

A promoter is a region of DNA that facilitates transcription of a particular gene when a transcription factor binds to it. Promoters are typically located near the genes they regulate and upstream of them. [64]

Insulators

A genetic insulator is a boundary element that plays two distinct roles in gene expression, either as an enhancer-blocking element or, rarely, as a barrier against condensed chromatin. An insulator in a DNA sequence is comparable to a linguistic word divider such as a comma in a sentence, because the insulator indicates where an enhanced or repressed sequence ends. [65]

Evolution

Shared sequences of apparently non-functional DNA are a major line of evidence of common descent. [66]

Pseudogene sequences appear to accumulate mutations more rapidly than coding sequences due to a loss of selective pressure. [15] This allows for the creation of mutant alleles that incorporate new functions that may be favored by natural selection; thus, pseudogenes can serve as raw material for evolution and can be considered "protogenes". [67]

A study published in 2019 shows that new genes can be fashioned from non-coding regions (a process termed de novo gene birth). [68] Some studies suggest that at least one-tenth of genes could be made in this way. [68]

Long range correlations

A statistical distinction between coding and non-coding DNA sequences has been found. It has been observed that nucleotides in non-coding DNA sequences display long range power law correlations while coding sequences do not. [69] [70] [71]

Forensic anthropology

Police sometimes gather DNA as evidence for purposes of forensic identification. As described in Maryland v. King, a 2013 U.S. Supreme Court decision: [72]

The current standard for forensic DNA testing relies on an analysis of the chromosomes located within the nucleus of all human cells. 'The DNA material in chromosomes is composed of "coding" and "non-coding" regions. The coding regions are known as genes and contain the information necessary for a cell to make proteins. . . . Non-protein coding regions . . . are not related directly to making proteins, [and] have been referred to as "junk" DNA.' The adjective "junk" may mislead the layperson, for in fact this is the DNA region used with near certainty to identify a person. [72]


The Case for Junk DNA

Genomes are like books of life. But until recently, their covers were locked. Finally we can now open up the books and page through them. But we only have a modest understanding of what we're actually seeing. We are still not sure how much of our genome encodes information that is important to our survival, and how much is just garbled padding.

Today is a good day to dip into the debate over what the genome is made of, thanks to the publication of an interesting commentary from Alex Palazzo and Ryan Gregory in PLOS Genetics. It's called "The Case for Junk DNA."

The debate over the genome can get dizzying. I find the best antidote to the vertigo is a little history. This history starts in the early 1900s.

At the time, geneticists knew that we carry genes (factors passed down from parents to offspring that influence our bodies), but they didn't know what genes were made of.

That changed starting in the 1950s. Scientists recognized that genes were made of DNA, and then figured out how genes shape our biology.

Our DNA is a string of units called bases. Our cells read the bases in a stretch of DNA (a gene) and build a molecule called RNA with a corresponding sequence. The cells then use the RNA as a guide to build a protein. Our bodies contain many different proteins, which give them structure and carry out jobs like digesting food.
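
For readers who like to see the two steps spelled out, here is a minimal sketch of DNA to RNA to protein. It assumes the input is the coding strand (so transcription amounts to swapping T for U), uses only a handful of entries from the standard genetic code, and works on a made-up toy gene; the function names are illustrative.

```python
# A small subset of the standard genetic code; None marks a stop codon.
CODON_TABLE = {
    "AUG": "Met", "GCU": "Ala", "CAG": "Gln", "UUC": "Phe", "GGA": "Gly",
    "UAA": None, "UAG": None, "UGA": None,
}

def transcribe(coding_strand):
    """Build the RNA sequence corresponding to a DNA coding strand."""
    return coding_strand.upper().replace("T", "U")

def translate(mrna):
    """Read the RNA three bases at a time until a stop codon is reached."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        codon = mrna[i:i + 3]
        if codon not in CODON_TABLE:
            raise ValueError(f"codon {codon} is outside this sketch's partial table")
        amino_acid = CODON_TABLE[codon]
        if amino_acid is None:
            break  # stop codon: the protein is finished
        protein.append(amino_acid)
    return protein

toy_gene = "ATGGCTCAGTTCGGATAA"  # invented sequence, not a real gene
mrna = transcribe(toy_gene)
print(mrna)
print("-".join(translate(mrna)))
```

In a real cell the RNA is copied from the complementary template strand and is usually processed before translation, details this sketch leaves out.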

But in the 1950s, scientists also began to discover bits of DNA outside the protein-coding regions that were important too. These so-called regulatory elements acted as switches for protein-coding genes. A protein latching onto one of those switches could prompt a cell to make a lot of proteins from a given gene. Or it could shut down a gene completely.

Meanwhile, scientists were also finding pieces of DNA in the genome that appeared to be neither protein-coding genes nor regulatory elements. In the 1960s, for example, Roy Britten and David Kohne found hundreds of thousands of repeating segments of DNA, each of which turned out to be just a few hundred bases long. Many of these repeating sequences were the product of virus-like stretches of DNA. These pieces of "selfish DNA" made copies of themselves, which were inserted back into the genome. Mutations then reduced them to inert fragments.

Other scientists found extra copies of genes that had mutations preventing them from making proteins, which came to be known as pseudogenes.

The human genome, we now know, contains about 20,000 protein-coding genes. That may sound like a lot of genetic material. But it only makes up about 2 percent of the genome. Some plants are even more extreme. While we have about 3.2 billion bases in our genomes, an onion has 16 billion, mostly consisting of repeating sequences and virus-like DNA.

The rest of the genome became a mysterious wilderness for geneticists. They would go on expeditions to map the non-coding regions and try to figure out what they were made of.

Some segments of DNA turned out to have functions, even if they didn't encode proteins or serve as switches. For example, sometimes our cells make RNA molecules that don't simply serve as templates for proteins. Instead, they have jobs of their own, such as sensing chemicals in the cell. So those stretches of DNA are considered genes, too, just not protein-coding genes.

With the exploration of the genome came a bloom of labels, some of which came to be used in confusing, and sometimes careless, ways. "Non-coding DNA" came to be shorthand for DNA that didn't encode proteins. But non-coding DNA could still have a function, such as switching off genes or producing useful RNA molecules.

Scientists also started referring to "junk DNA." Different scientists used the term to refer to different things. The Japanese geneticist Susumu Ohno used the term when developing a theory for how DNA mutates. Ohno envisioned protein-coding genes being accidentally duplicated. Later, mutations would hit the new copies of those genes. In a few cases, the mutations would give the new gene copies a new function. In most, however, they just killed the gene. He referred to the extra useless copies of genes as junk DNA. Other people used the term to refer broadly to any piece of DNA that didn't have a function.

And then–like crossing the streams in Ghostbusters–junk DNA and non-coding DNA got mixed up. Sometimes scientists discovered a stretch of non-coding DNA that had a function. They might clip out the segment from the DNA in an egg and find it couldn’t develop properly. BAM!–there was a press release declaring that non-coding DNA had long been dismissed as junk, but lo and behold, non-coding DNA can do something after all.

Given that regulatory elements were discovered in the 1950s (the discovery was recognized with Nobel Prizes), this is just illogical.

Nevertheless, a worthwhile question remained: how much of the genome had a function? How much was junk?

To Britten and Kohne, the idea that repeating DNA was useless was “repugnant.” Seemingly on aesthetic grounds, they preferred the idea that it had a function that hadn’t been discovered yet.

Others, however, argued that repeating DNA (and pseudogenes and so on) was just junk: vast vestiges of disabled genetic material that we carry down through the generations. If the genome was mostly functional, then it was hard to see why it takes five times more functional DNA to make an onion than a human, or to explain the huge range of genome sizes.

In recent years, a consortium of scientists carried out a project called the Encyclopedia of DNA Elements (ENCODE for short) to classify all the parts of the genome. To see if non-coding DNA was functional, they checked for proteins that were attached to them–possibly switching on regulatory elements. They found a lot of them.

“These data enabled us to assign biochemical functions for 80% of the genome, in particular outside of the well-studied protein-coding regions,” they reported.

Science translated that conclusion into a headline, “ENCODE Project writes eulogy for junk DNA.”

A lot of defenders of junk have attacked this conclusion–or, to be more specific, how the research got translated into press releases and then into news articles. In their new review, Palazzo and Gregory present some of the main objections.

Just because proteins grab onto a piece of DNA, for example, doesn’t actually mean that there’s a gene nearby that is going to make something useful. It could just happen to have the right sequence to make the proteins stick to it.

And even if a segment of DNA does give rise to RNA, that RNA may not have a function. The cell may accidentally make RNA molecules, which it then chops up.

If I had to guess why Britten and Kohne found junk DNA repugnant, it probably had to do with evolution. Darwin, after all, had shown how natural selection can transform a population, and how, over millions of years, it could produce adaptations. In the 1900s, geneticists turned his idea into a modern theory. Genes that boosted reproduction could become more common, while ones that didn’t could be eliminated from a population. You’d expect that natural selection would have left the genome mostly full of functional stuff.

Palazzo and Gregory, on the other hand, argue that evolution should produce junk. The reason has to do with the fact that natural selection can be quite weak in some situations. The smaller a population gets, the less effective natural selection is at favoring beneficial mutations. In small populations, a mutation can spread even if it’s not beneficial. And compared to bacteria, the population of humans is very small. (Technically speaking, it’s the “effective population size” that’s small; follow the link for an explanation of the difference.) When non-functional DNA builds up in our genome, it’s harder for natural selection to strip it out than if we were bacteria.
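To make the population-size argument concrete, here is a minimal sketch. It is not taken from Palazzo and Gregory's review; it uses Kimura's standard diffusion approximation for the chance that a single new mutation eventually spreads through a whole population. The population sizes and the selection coefficient are hypothetical values picked for illustration, and the effective population size is assumed to equal the census size.

```python
import math


def fixation_probability(pop_size: int, s: float) -> float:
    """Kimura's diffusion approximation for a brand-new mutation.

    Assumes a diploid population whose effective size equals pop_size,
    so the mutation starts at frequency 1 / (2 * pop_size).
    s > 0 means beneficial, s < 0 harmful, s == 0 neutral.
    """
    if pop_size <= 0:
        raise ValueError("pop_size must be positive")
    if s == 0.0:
        return 1.0 / (2 * pop_size)  # pure drift
    exponent = -4.0 * pop_size * s
    if exponent > 700.0:
        return 0.0  # exp() would overflow; the probability is effectively zero
    # (1 - e^(-2s)) / (1 - e^(-4Ns)), written with expm1 for numerical accuracy
    return math.expm1(-2.0 * s) / math.expm1(exponent)


def relative_to_neutral(pop_size: int, s: float) -> float:
    """How often the mutation fixes compared with a neutral one."""
    return fixation_probability(pop_size, s) * 2 * pop_size


if __name__ == "__main__":
    slightly_harmful = -0.001  # hypothetical selection coefficient
    for n in (100, 10_000, 100_000):  # hypothetical population sizes
        ratio = relative_to_neutral(n, slightly_harmful)
        print(f"N = {n:>7}: fixes {ratio:.3g} times as often as a neutral mutation")
```

Under these assumptions, a slightly harmful mutation in a population of 100 fixes roughly 80 percent as often as a neutral one, while in a population of 100,000 it almost never does. That is the sense in which selection is "weak" in small populations.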

While junk is expected, a junk-free genome is not. Palazzo and Gregory based this claim on a concept with an awesome name: mutational meltdown.

Here’s how it works. A population of, say, frogs is reproducing. Every time they produce a new tadpole, that tadpole gains a certain number of mutations. A few of those mutations may be beneficial. The rest will be neutral or harmful. If harmful mutations emerge at a rate that’s too fast for natural selection to weed them out, they’ll start to pile up in the genome. Overall, the population will get sicker, producing fewer offspring. Eventually the mutations will drive the whole population to extinction.

Mutational meltdown puts an upper limit on how many genes an organism can have. If a frog has 10,000 genes, those are 10,000 potential targets for a harmful mutation. If the frog has 100,000 genes, it has ten times more targets.

Estimates of the human mutation rate suggest that somewhere between 70 and 150 new mutations strike the genome of every baby. Based on the risk of mutational meltdown, Palazzo and Gregory estimate that only ten percent of the human genome can be functional.* The other ninety percent must be junk DNA. If a mutation alters junk DNA, it doesn’t do any harm because the junk isn’t doing us any good to begin with. If our genome was 80 percent functional (the figure batted around when the ENCODE project results first came out), then we should be extinct.
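The arithmetic behind that argument can be sketched in a few lines. This is only a back-of-the-envelope illustration, not Palazzo and Gregory's actual calculation: it assumes new mutations land uniformly at random across the genome, so the expected number hitting functional DNA is simply the mutation count times the functional fraction. The function name is hypothetical.

```python
def expected_functional_hits(new_mutations: float, functional_fraction: float) -> float:
    """Expected number of new mutations that land in functional DNA.

    Assumption: mutations are spread uniformly over the genome, so each one
    hits functional sequence with probability equal to functional_fraction.
    """
    if not 0.0 <= functional_fraction <= 1.0:
        raise ValueError("functional_fraction must be between 0 and 1")
    return new_mutations * functional_fraction


if __name__ == "__main__":
    # 70 and 150 are the per-baby mutation estimates quoted in the text;
    # 0.10 and 0.80 are the two competing functional fractions.
    for fraction in (0.10, 0.80):
        low = expected_functional_hits(70, fraction)
        high = expected_functional_hits(150, fraction)
        print(f"{fraction:.0%} functional: about {low:.0f} to {high:.0f} hits per baby")
```

Not every hit in functional DNA is harmful, so these numbers are upper bounds on the harmful load, but they show why an 80 percent functional genome implies many more potentially damaging mutations per generation than a 10 percent one.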

It may sound wishy-washy for me to say this, but the junk DNA debates will probably settle somewhere in between the two extremes. Is the entire genome functional? No. Is everything aside from protein-coding genes junk? No. We’ve known for over 50 years that non-coding DNA can be functional. Even if “only” ten percent of the genome turns out to be functional, that’s a huge collection of DNA. It’s six times bigger than the DNA found in all our protein-coding genes. There could be thousands of RNA molecules scientists have yet to understand.

Even if ninety percent of the genome does prove to be junk, that doesn’t mean the junk hasn’t played a role in our evolution. As I wrote last week in the New York Times, it’s from these non-coding regions that many new protein-coding genes evolve. What’s more, much of our genome is made up of viruses, and every now and then evolution has, in effect, harnessed those viral genes to carry out a job for our own bodies. The junk is a part of us, and it, too, helps to make us what we are.

*I mean functional in terms of its sequence. The DNA might still do something important structurally–helping the molecule bend in a particular way, for example.



Glossary

DNA: Deoxyribonucleic acid is the chemical that stores genetic information in our cells. Shaped like a double helix, DNA passes down from one generation to the next.

RNA: Ribonucleic acid is a type of molecule used in making proteins in the body.

Genome: The complete genetic makeup of an organism, which contains all the biological information to build and keep it alive.

Gene: A stretch of DNA that tells a cell how to make specific proteins or RNA molecules.

Enzyme: A molecule that promotes a chemical reaction inside a living organism.

Stem cell: A biological master cell that can multiply and become many different types of tissue. They can also replicate to make more stem cells.


Functions for the Useless

Nearly a decade after the completion of the Human Genome Project, which gave us the first full read of our genetic script at the start of the century, a team of over 400 scientists released what they called the Encyclopedia of DNA Elements, or ENCODE for short. The international collaboration explored the function of every letter in the genome. The results of the massive undertaking called for a reassessment of junk DNA. Though less than two percent of the genome makes proteins, around 80 percent carries out some sort of function.

What fell into ENCODE’s definition of functionality was pretty broad, however. Any “biochemical activity” was fair game — getting transcribed into RNA, even if chopped later in the process, qualified sequences as functional. But many of the “junk” sections do have important roles, including regulating how DNA is transcribed and translated from there into proteins. If protein-coding sequences are the notes of a symphony, then some of the non-coding sequences act like the conductor, influencing the pace and repetitions of the masterpiece.

But not every bit of junk DNA might have a functional use. In a study published in Molecular Biology of the Cell in 2008, scientists cleaned junk DNA from yeast’s genome. For particular genes, they got rid of introns — the sections that get chopped away after DNA transcription. They reported the intron removal had no significant consequences for the cells under laboratory conditions, supporting the notion that they don’t have any function.

But studies published in Nature this year argued otherwise. When food is scarce, researchers found these sequences are essential for yeast survival. The usefulness of these introns might depend on the context, these studies argue — still a far cry from being junk.


Research team finds important role for junk DNA

Scientists have called it "junk DNA." They have long been perplexed by these extensive strands of genetic material that dominate the genome but seem to lack specific functions. Why would nature force the genome to carry so much excess baggage?

Now researchers from Princeton University and Indiana University who have been studying the genome of a pond organism have found that junk DNA may not be so junky after all. They have discovered that DNA sequences from regions of what had been viewed as the "dispensable genome" are actually performing functions that are central for the organism. They have concluded that the genes spur an almost acrobatic rearrangement of the entire genome that is necessary for the organism to grow.

It all happens very quickly. Genes called transposons in the single-celled pond-dwelling organism Oxytricha produce cell proteins known as transposases. During development, the transposons appear to first influence hundreds of thousands of DNA pieces to regroup. Then, when no longer needed, the organism cleverly erases the transposases from its genetic material, paring its genome to a slim 5 percent of its original load.


"The transposons actually perform a central role for the cell," said Laura Landweber, a professor of ecology and evolutionary biology at Princeton and an author of the study. "They stitch together the genes in working form." The work appeared in the May 15 edition of Science.

In order to prove that the transposons have this reassembly function, the scientists disabled several thousand of these genes in some Oxytricha. The organisms with the altered DNA, they found, failed to develop properly.

Other authors from Princeton's Department of Ecology and Evolutionary Biology include postdoctoral fellows Mariusz Nowacki and Brian Higgins; 2006 alumna Genevieve Maquilan; and graduate student Estienne Swart. Former Princeton postdoctoral fellow Thomas Doak, now of Indiana University, also contributed to the study.

Landweber and other members of her team are researching the origin and evolution of genes and genome rearrangement, with particular focus on Oxytricha because it undergoes massive genome reorganization during development.

In her lab, Landweber studies the evolutionary origin of novel genetic systems such as Oxytricha's. By combining molecular, evolutionary, theoretical and synthetic biology, Landweber and colleagues last year discovered an RNA (ribonucleic acid)-guided mechanism underlying its complex genome rearrangements.

"Last year, we found the instruction book for how to put this genome back together again -- the instruction set comes in the form of RNA that is passed briefly from parent to offspring and these maternal RNAs provide templates for the rearrangement process," Landweber said. "Now we've been studying the actual machinery involved in the process of cutting and splicing tremendous amounts of DNA. Transposons are very good at that."

The term "junk DNA" was originally coined to refer to a region of DNA that contained no genetic information. Scientists are beginning to find, however, that much of this so-called junk plays important roles in the regulation of gene activity. No one yet knows how extensive that role may be.

Instead, scientists sometimes refer to these regions as "selfish DNA" if they make no specific contribution to the reproductive success of the host organism. Like a computer virus that copies itself ad nauseam, selfish DNA replicates and passes from parent to offspring for the sole benefit of the DNA itself. The present study suggests that some selfish DNA transposons can instead confer an important role to their hosts, thereby establishing themselves as long-term residents of the genome.


Is 75% of the Human Genome Junk DNA?

By the rude bridge that arched the flood,
Their flag to April’s breeze unfurled,
Here once the embattled farmers stood,
And fired the shot heard round the world.

–Ralph Waldo Emerson, Concord Hymn

Emerson referred to the Battles of Lexington and Concord, the first skirmishes of the Revolutionary War, as the “shot heard round the world.”

While not as loud as the gunfire that triggered the Revolutionary War, a recent article published in Genome Biology and Evolution by evolutionary biologist Dan Graur has garnered a lot of attention,[1] serving as the latest salvo in the junk DNA wars, a conflict between genomics scientists and evolutionary biologists about the amount of functional DNA sequences in the human genome.

Clearly, this conflict has important scientific ramifications, as researchers strive to understand the human genome and seek to identify the genetic basis for diseases. The functional content of the human genome also has significant implications for creation-evolution skirmishes. If most of the human genome turns out to be junk after all, then the case for a Creator potentially suffers collateral damage.

According to Graur, no more than 25% of the human genome is functional—a much lower percentage than reported by the ENCODE Consortium. Released in September 2012, phase II results of the ENCODE project indicated that 80% of the human genome is functional, with the expectation that the percentage of functional DNA in the genome would rise toward 100% when phase III of the project reached completion.

If true, Graur’s claim would represent a serious blow to the validity of the ENCODE project conclusions and devastate the RTB human origins creation model. Intelligent design proponents and creationists (like me) have heralded the results of the ENCODE project as critical in our response to the junk DNA challenge.

Junk DNA and the Creation vs. Evolution Battle

Evolutionary biologists have long considered the presence of junk DNA in genomes as one of the most potent pieces of evidence for biological evolution. Skeptics ask, “Why would a Creator purposely introduce identical nonfunctional DNA sequences at the same locations in the genomes of different, though seemingly related, organisms?”

When the draft sequence was first published in 2000, researchers thought only around 2–5% of the human genome consisted of functional sequences, with the rest being junk. Numerous skeptics and evolutionary biologists claim that such a vast amount of junk DNA in the human genome is compelling evidence for evolution and the most potent challenge against intelligent design/creationism.

But these arguments evaporate in the wake of the ENCODE project. If valid, the ENCODE results would radically alter our view of the human genome. No longer could the human genome be regarded as a wasteland of junk; rather, the human genome would have to be recognized as an elegantly designed system that displays sophistication far beyond what most evolutionary biologists ever imagined.

ENCODE Skeptics

The findings of the ENCODE project have been criticized by some evolutionary biologists who have cited several technical problems with the study design and the interpretation of the results. (See articles listed under “Resources to Go Deeper” for a detailed description of these complaints and my responses.) But ultimately, their criticisms appear to be motivated by an overarching concern: if the ENCODE results stand, then it means key features of the evolutionary paradigm can’t be correct.

Calculating the Percentage of Functional DNA in the Human Genome

Graur (perhaps the foremost critic of the ENCODE project) has tried to discredit the ENCODE findings by demonstrating that they are incompatible with evolutionary theory. Toward this end, he has developed a mathematical model to calculate the percentage of functional DNA in the human genome based on mutational load—the amount of deleterious mutations harbored by the human genome.

Graur argues that junk DNA functions as a “sponge” absorbing deleterious mutations, thereby protecting functional regions of the genome. Considering this buffering effect, Graur wanted to know how much junk DNA must exist in the human genome to buffer against the loss of fitness (which would result from deleterious mutations in functional DNA) so that a constant population size can be maintained.

Historically, the replacement level fertility rates for human beings have been two to three children per couple. Based on Graur’s modeling, this fertility rate requires 85–90% of the human genome to be composed of junk DNA in order to absorb deleterious mutations—ensuring a constant population size, with the upper limit of functional DNA capped at 25%.

Graur also calculated a fertility rate of 15 children per couple, at minimum, to maintain a constant population size, assuming 80% of the human genome is functional. According to Graur’s calculations, if 100% of the human genome displayed function, the minimum replacement level fertility rate would have to be 24 children per couple.

He argues that both conclusions are unreasonable. On this basis, therefore, he concludes that the ENCODE results cannot be correct.
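The general shape of a mutational-load calculation like this can be sketched as follows. This is not Graur's actual model. It relies on the classic textbook result that, if harmful mutations arrive at an average rate of U per individual per generation, mean fitness at equilibrium falls to about e^(-U), so each couple must produce about 2·e^(U) children to keep the population constant. The default parameter values (62 new mutations per generation, 4 percent of hits in functional DNA being harmful) are placeholders chosen for illustration and are not taken from the article.

```python
import math


def required_children_per_couple(
    functional_fraction: float,
    mutations_per_generation: float = 62.0,  # assumption: illustrative value
    deleterious_share: float = 0.04,         # assumption: illustrative value
) -> float:
    """Children per couple needed to offset mutational load.

    U is the expected number of harmful new mutations per individual:
    total new mutations * share landing in functional DNA * share of
    those that are harmful. Mean fitness is approximated as exp(-U).
    """
    if not 0.0 <= functional_fraction <= 1.0:
        raise ValueError("functional_fraction must be between 0 and 1")
    harmful_rate = mutations_per_generation * functional_fraction * deleterious_share
    mean_fitness = math.exp(-harmful_rate)
    return 2.0 / mean_fitness  # two surviving children replace two parents


if __name__ == "__main__":
    for fraction in (0.10, 0.25, 0.80, 1.00):
        children = required_children_per_couple(fraction)
        print(f"{fraction:.0%} functional -> about {children:.1f} children per couple")
```

With these placeholder values the sketch gives roughly 14.5 children per couple at 80 percent functional and roughly 24 at 100 percent, in the same ballpark as the figures quoted above. That agreement depends entirely on the assumed parameters; a higher mutation rate or harmful share would push the required fertility up steeply because of the exponential.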

Response to Graur

So, has Graur’s work invalidated the ENCODE project results? Hardly. Here are four reasons why I’m skeptical.

1. Graur’s estimate of the functional content of the human genome is based on mathematical modeling, not experimental results.

An adage I heard repeatedly in graduate school applies: “Theories guide, experiments decide.” Though the ENCODE project results theoretically don’t make sense in light of the evolutionary paradigm, that is not a reason to consider them invalid. A growing number of studies provide independent experimental validation of the ENCODE conclusions. (Go here and here for two recent examples.)

To question experimental results because they don’t align with a theory’s predictions is a “Bizarro World” approach to science. Experimental results and observations determine a theory’s validity, not the other way around. Yet when it comes to the ENCODE project, its conclusions seem to be weighed based on their conformity to evolutionary theory. Simply put, ENCODE skeptics are doing science backwards.

While Graur and other evolutionary biologists argue that the ENCODE results don’t make sense from an evolutionary standpoint, I would argue as a biochemist that the high percentage of functional regions in the human genome makes perfect sense. The ENCODE project determined that a significant fraction of the human genome is transcribed. They also measured high levels of protein binding.

ENCODE skeptics argue that this biochemical activity is merely biochemical noise. But this assertion does not make sense because (1) biochemical noise costs energy and (2) random interactions between proteins and the genome would be harmful to the organism.

Transcription is an energy- and resource-intensive process. To believe that most transcripts are merely biochemical noise would be untenable. Such a view ignores cellular energetics. Transcribing a large percentage of the genome when most of the transcripts serve no useful function would routinely waste a significant amount of the organism’s energy and material stores. If such an inefficient practice existed, surely natural selection would eliminate it and streamline transcription to produce transcripts that contribute to the organism’s fitness.

Apart from energetics considerations, this argument ignores the fact that random protein binding would make a dire mess of genome operations. Without minimizing these disruptive interactions, biochemical processes in the cell would grind to a halt. It is reasonable to think that the same considerations would apply to transcription factor binding with DNA.

2. Graur’s model employs some questionable assumptions.

Graur uses an unrealistically high rate for deleterious mutations in his calculations.

Graur determined the deleterious mutation rate using protein-coding genes. These DNA sequences are highly sensitive to mutations. In contrast, other regions of the genome that display function, such as those that (1) dictate the three-dimensional structure of chromosomes, (2) serve as transcription factor binding sites, and (3) act as histone binding sites, are much more tolerant of mutations. Ignoring these sequences in the modeling work artificially increases the amount of required junk DNA to maintain a constant population size.

3. The way Graur determines if DNA sequence elements are functional is questionable.

Graur uses the selected-effect definition of function. According to this definition, a DNA sequence is only functional if it is undergoing negative selection. In other words, sequences in genomes can be deemed functional only if they evolved under evolutionary processes to perform a particular function. Once evolved, these sequences, if they are functional, will resist evolutionary change (due to natural selection) because any alteration would compromise the function of the sequence and endanger the organism. If deleterious, the sequence variations would be eliminated from the population due to the reduced survivability and reproductive success of organisms possessing those variants. Hence, functional sequences are those under the effects of selection.

In contrast, the ENCODE project employed a causal definition of function. Accordingly, function is ascribed to sequences that play some observationally or experimentally determined role in genome structure and/or function.

The ENCODE project focused on experimentally determining which sequences in the human genome displayed biochemical activity using assays that measured

  • transcription,
  • binding of transcription factors to DNA,
  • histone binding to DNA,
  • DNA binding by modified histones,
  • DNA methylation, and
  • three-dimensional interactions between enhancer sequences and genes.

In other words, if a sequence is involved in any of these processes—all of which play well-established roles in gene regulation—then the sequences must have functional utility. That is, if sequence P performs function G, then sequence P is functional.

So why does Graur insist on a selected-effect definition of function? For no other reason than a causal definition ignores the evolutionary framework when determining function. He insists that function be defined exclusively within the context of the evolutionary paradigm. In other words, his preference for defining function has more to do with philosophical concerns than scientific ones—and with a deep-seated commitment to the evolutionary paradigm.

As a biochemist, I am troubled by the selected-effect definition of function because it is theory-dependent. In science, cause-and-effect relationships (which include biological and biochemical function) need to be established experimentally and observationally, independent of any particular theory. Once these relationships are determined, they can then be used to evaluate the theories at hand. Do the theories predict (or at least accommodate) the established cause-and-effect relationships, or not?

Using a theory-dependent approach poses the very real danger that experimentally determined cause-and-effect relationships (or, in this case, biological functions) will be discarded if they don’t fit the theory. And, again, it should be the other way around. A theory should be discarded, or at least reevaluated, if its predictions don’t match these relationships.

What difference does it make which definition of function Graur uses in his model? A big difference. The selected-effect definition is more restrictive than the causal-role definition. This restrictiveness translates into overlooked function and increases the replacement level fertility rate.

4. Buffering against deleterious mutations is a function.

As part of his model, Graur argues that junk DNA is necessary in the human genome to buffer against deleterious mutations. By adopting this view, Graur has inadvertently identified function for junk DNA. In fact, he is not the first to argue along these lines. Biologist Claudiu Bandea has posited that high levels of junk DNA can make genomes resistant to the deleterious effects of transposon insertion events in the genome. If insertion events are random, then the offending DNA is much more likely to insert itself into “junk DNA” regions instead of coding and regulatory sequences, thus protecting information-harboring regions of the genome.

If the last decade of work in genomics has taught us anything, it is this: we are in our infancy when it comes to understanding the human genome. The more we learn about this amazingly complex biochemical system, the more elegant and sophisticated it becomes. Through this process of discovery, we continue to identify functional regions of the genome, DNA sequences long thought to be “junk.”

In short, the criticisms of the ENCODE project reflect a deep-seated commitment to the evolutionary paradigm and, bluntly, are at war with the experimental facts.

Bottom line: if the ENCODE results stand, it means that key aspects of the evolutionary paradigm can’t be correct.


Perennial Problem of C-Value

Information and Structure.

The junk idea long predates genomics and since its early decades has been grounded in the “C-value paradox,” the observation that DNA amounts (C-value denotes haploid nuclear DNA content) and complexities correlate very poorly with organismal complexity or evolutionary “advancement” (10–14). Humans do have a thousand times as much DNA as simple bacteria, but lungfish have at least 30 times more than humans, as do many flowering plants and some unicellular protists (14). Moreover, as is often noted, the disconnection between C-value and organismal complexity is also found within more restricted groups comprising organisms of seemingly similar lifestyle and comparable organismal or behavioral complexity. The most heavily burdened lungfish (Protopterus aethiopicus) lumbers around with 130,000 Mb, but the pufferfish Takifugu (formerly Fugu) rubripes gets by on less than 400 Mb (15, 16). A less familiar but better (because monophyletic) animal example might be amphibians, showing a 120-fold range from frogs to salamanders (17). Among angiosperms, there is a thousandfold variation (14). Additionally, even within a single genus, there can be substantial differences. Salamander species belonging to Plethodon boast a fourfold range, to cite a comparative study popular from the 1970s (18). Sometimes, such within-genus genome size differences reflect large-scale or whole-genome duplications and sometimes rampant selfish DNA or transposable element (TE) multiplication. Schnable et al. (19) figure that the maize genome has more than doubled in size in the last 3 million years, overwhelmingly through the replication and accumulation of TEs, for example. If we do not think of this additional or “excess” DNA, so manifest through comparisons between and within biological groups, as junk (irrelevant if not frankly detrimental to the survival and reproduction of the organism bearing it), how then are we to think of it?
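The size gaps quoted above are easier to appreciate as simple ratios. The short sketch below computes them. The lungfish and pufferfish figures come from the paragraph (the pufferfish value is the stated upper bound of 400 Mb); the human figure of about 3,100 Mb is an approximate value added here as an assumption and does not appear in the text.

```python
# Haploid genome sizes in megabases (Mb).
GENOME_SIZE_MB = {
    "lungfish": 130_000,  # Protopterus aethiopicus, from the text
    "pufferfish": 400,    # Takifugu rubripes; text says "less than 400 Mb"
    "human": 3_100,       # assumption: approximate value, not from the text
}


def fold_difference(larger: str, smaller: str) -> float:
    """How many times bigger one genome is than another."""
    return GENOME_SIZE_MB[larger] / GENOME_SIZE_MB[smaller]


if __name__ == "__main__":
    print(f"lungfish vs pufferfish: {fold_difference('lungfish', 'pufferfish'):.0f}-fold")
    print(f"lungfish vs human:      {fold_difference('lungfish', 'human'):.0f}-fold")
    print(f"human vs pufferfish:    {fold_difference('human', 'pufferfish'):.1f}-fold")
```

Because the pufferfish value is an upper bound, the true lungfish-to-pufferfish ratio is at least the figure printed here.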

Of course, DNA inevitably does have a basic structural role to play, unlinked to specific biochemical activities or the encoding of information relevant to genes and their expression. Centromeres and telomeres exemplify noncoding chromosomal components with specific functions. More generally, DNA as a macromolecule bulks up and gives shape to chromosomes and thus, as many studies show, determines important nuclear and cellular parameters such as division time and size, themselves coupled to organismal development (11–13, 17). The “selfish DNA” scenarios of 1980 (20–22), in which C-value represents only the outcome of conflicts between upward pressure from reproductively competing TEs and downward-directed energetic restraints, have thus, in subsequent decades, yielded to more nuanced understandings. Cavalier-Smith (13, 20) called DNA’s structural and cell biological roles “nucleoskeletal,” considering C-value to be optimized by organism-level natural selection (13, 20). Gregory, now the principal C-value theorist, embraces a more “pluralistic, hierarchical approach” to what he calls “nucleotypic” function (11, 12, 17). A balance between organism-level selection on nuclear structure and cell size, cell division times and developmental rate, selfish genome-level selection favoring replicative expansion, and (as discussed below) supraorganismal (clade-level) selective processes, as well as drift, must all be taken into account.

These forces will play out differently in different taxa. González and Petrov (23) point out, for instance, that Drosophila and humans are at opposite extremes in terms of the balance of processes, with the minimalist genomes of the former containing few (but mostly young and quite active) TEs, whereas at least one-half of our own much larger genome comprises the moribund remains of older TEs, principally SINEs and LINEs (short and long interspersed nuclear elements). Such difference may in part reflect population size. As Lynch notes, small population size (characteristic of our species) will have limited the effectiveness of natural selection in preventing a deleterious accumulation of TEs (24, 25).

Zuckerkandl (26) once mused that all genomic DNA must be to some degree “polite,” in that it must not lethally interfere with gene expression. Indeed, some might suggest, as I will below, that true junk might better be defined as DNA not currently held to account by selection for any sort of role operating at any level of the biological hierarchy (27). However, junk advocates have to date generally considered that even DNA fulfilling bulk structural roles remains, in terms of encoded information, just junk. Cell biology may require a certain C-value, but most of the stretches of noncoding DNA that go to satisfying that requirement are junk (or worse, selfish).

In any case, structural roles or multilevel selection theorizing are not what ENCODE commentators are endorsing when they proclaim the end of junk, touting the existence of 4 million gene switches or myriad elements that determine gene expression and assigning biochemical functions for 80% of the genome. Indeed, there would be no excitement in either the press or the scientific literature if all the ENCODE team had done was acknowledge an established theory concerning DNA’s structural importance. Rather, the excitement comes from interpreting ENCODE’s data to mean that a much larger fraction of our DNA than until very recently thought contributes to our survival and reproduction as organisms, because it encodes information transcribed or expressed phenotypically in one tissue or another, or specifically regulates such expression.

A Thought Experiment.

ENCODE (5) defines a functional element (FE) as “a discrete genome segment that encodes a defined product (for example, protein or non-coding RNA) or displays a reproducible biochemical signature (for example, protein binding, or a specific chromatin structure).” A simple thought experiment involving FEs so-defined is at the heart of my argument.

Suppose that there had been (and probably, some day, there will be) ENCODE projects aimed at enumerating, by transcriptional and chromatin mapping, factor footprinting, and so forth, all of the FEs in the genomes of Takifugu and a lungfish, some small- and large-genomed amphibians (including several species of Plethodon), plants, and various protists. There are, I think, two possible general outcomes of this thought experiment, neither of which would give us clear license to abandon junk.

The first outcome would be that FEs (estimated to be in the millions in our genome) turn out to be more or less constant in number, regardless of C-value—at least among similarly complex organisms. If larger C-value by itself does not imply more FEs, then there will, of course, be great differences in what we might call functional density (FEs per kilobase) (26) among species. FEs spaced by kilobases in Arabidopsis would be megabases apart in maize on average. Averages obscure details: the extra DNA in the larger genomes might be sequestered in a few giant silent regions rather than uniformly stretching out the space between FEs or lengthening intragenic introns. However, in either case, this DNA could be seen as a sort of polite functionless filler or diluent. At best, such DNA might have functions only of the structural or nucleoskeletal/nucleotypic sort. Indeed, even this sort of functional attribution is not necessary. There is room within an expanded, pluralistic and hierarchical theory of C-value (see below) (12, 27) for much DNA that makes no contribution whatever to survival and reproduction at the organismal level and thus is junk at that level, although it may be under selection at the sub- or supraorganismal levels (TEs and clade selection).

If the human genome is junk-free, then it must be very luckily poised at some sort of minimal size for organisms of human complexity. We may no longer think that mankind is at the center of the universe, but we still consider our species’ genome to be unique, first among many in having made such full and efficient use of all of its millions of SINEs and LINEs (retrotransposable elements) and introns to encode the multitudes of lncRNAs and house the millions of enhancers necessary to make us the uniquely complex creatures that we believe ourselves to be. However, were this extraordinary coincidence the case, a corollary would be that junk would not be defunct for many other larger genomes: the term would not need to be expunged from the genomicist’s lexicon more generally. As well, if, as is commonly believed, much of the functional complexity of the human genome is to be explained by evolution of our extraordinary cognitive capacities, then many other mammals of lesser acumen but similar C-value must truly have junk in their DNA.

The second likely general outcome of my thought experiment would be that FEs as defined by ENCODE increase in number with C-value, regardless of apparent organismal complexity. If they increase roughly proportionately, FE numbers will vary over a many-hundredfold range among organisms normally thought to be similarly complex. Defining or measuring complexity is, of course, problematic if not impossible. Still, it would be hard to convince ourselves that lungfish are 300 times more complex than Takifugu or 40 times more complex than us, whatever complexity might be. More likely, if indeed FE numbers turn out to increase with C-value, we will decide that we need to think again about what function is, how it becomes embedded in macromolecular structures, and what FEs as defined by ENCODE have to tell us about it.
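The two outcomes of the thought experiment can be put into rough numbers. The sketch below is only an illustration: the genome sizes are approximate round figures assumed here, and `HUMAN_FES` is a hypothetical placeholder, since the text says only that FEs are "in the millions" in our genome. The function names are likewise invented for this example.

```python
# Approximate haploid genome sizes in megabases.
# Assumed round figures for illustration, not values taken from the text.
GENOME_MB = {
    "Takifugu": 400,
    "human": 3_100,
    "lungfish": 130_000,
}

# Hypothetical FE count for the human genome ("in the millions").
HUMAN_FES = 3_000_000


def constant_fe_density(genome_mb, fe_count=HUMAN_FES):
    """Outcome 1: FE number is fixed, so FEs per kilobase fall as the genome grows."""
    return fe_count / (genome_mb * 1_000)


def proportional_fe_count(genome_mb, reference_mb=GENOME_MB["human"],
                          reference_fes=HUMAN_FES):
    """Outcome 2: FE number scales in direct proportion to genome size."""
    return reference_fes * genome_mb / reference_mb


for name, size_mb in GENOME_MB.items():
    density = constant_fe_density(size_mb)
    count = proportional_fe_count(size_mb)
    print(f"{name:9s} outcome 1: {density:.4f} FEs/kb   outcome 2: {count:,.0f} FEs")
```

With these assumed sizes, the proportional model gives the lungfish about 325 times as many FEs as Takifugu and about 42 times as many as human, roughly in line with the 300-fold and 40-fold figures quoted above.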



NIST-led Research De-Mystifies Origins Of 'Junk' DNA

One man's junk is another's treasure
Washington - Mar 26, 2004
A debate over the origins of what is sometimes called "junk" DNA has been settled by research involving scientists at the Center for Advanced Research in Biotechnology (CARB) and a collaborator, who developed rigorous proof that these mysterious sections were added to DNA "late" in the evolution of life on earth--after the formation of modern-sized genes, which contain instructions for making proteins.

A biologist with the Commerce Department's National Institute of Standards and Technology (NIST) led the research team, which reported its findings in the March 10 online edition of Molecular Biology and Evolution.

The results are based on a systematic, statistically rigorous analysis of publicly available genetic data carried out with bioinformatics software developed at CARB.

In humans, there is so much apparent "junk" DNA (sections of the genome with no known function) that it takes up more space than the functional parts. Much of this junk consists of "introns," which appear as interruptions plopped down in the middle of genes.

Discovered in the 1970s, introns mystify scientists but are readily accounted for by cells: when the cellular machinery transcribes a gene in preparation for making a protein, introns are simply spliced out of the transcript.

Research from the CARB group appears to resolve a debate over the "early versus late" timing of the appearance of introns. Since introns were discovered in 1978, scientists have debated whether genes were born split (the "introns-early" view), or whether they became split after eukaryotic cells (the ones that gave rise to animals and their relatives) diverged from bacteria roughly 2 billion years ago (the "introns-late" view).

Bacterial genomes lack introns. Although the study did not attempt to propose a function for introns, or determine whether they are beneficial or harmful, the results appear to rule out the "introns-early" view.

The CARB analysis shows that the probability of a modern intron's presence in an ancestral gene common to the genes studied is roughly 1 percent, indicating that the vast majority of today's introns appeared subsequent to the origin of the genes.

This conclusion is supported by the findings regarding placement patterns for introns within genes. It long has been observed that, in the sequences of nitrogen-containing compounds that make up our DNA genomes, introns prefer some sites more than others. The CARB study indicates that these preferences are side effects of late-stage intron gain, rather than side effects of intron-mediated gene formation.

The CARB results are based on an analysis of carefully processed data for 10 families of protein-coding genes in animals, plants, fungi and their relatives (see sidebar for details of the method used). A variety of statistical modeling, theoretical, and automated analytical approaches were used. While most were conventional, their combined application to the study of introns was novel.

The CARB study also is unique in using an evolutionary model as the basis for inferring the presence of ancestral introns. The research was made possible in part by the increasing availability, over the past decade, of massive amounts of genetic sequence data.

The lead researcher is Arlin B. Stoltzfus of NIST. Collaborators include Wei-Gang Qiu, formerly of CARB and the University of Maryland and now at Hunter College in New York City, and Nick Schisler, currently at Furman University, Greenville, S.C.

CARB is a cooperative venture of NIST and the University of Maryland Biotechnology Institute.

CARB's Approach to Understanding the Origins of 'Junk' DNA

Scientists long have compared the sequences of chemical compounds in different proteins, genes and entire genomes to derive clues about structure and function.

The most sophisticated comparative methods are evolutionary and rely on matching similar sequences from different organisms, inferring family trees to determine relationships, and reconstructing changes that must have occurred to create biologically relevant differences.

This type of analysis is usually done with one sequence family at a time. The Center for Advanced Research in Biotechnology (CARB), a cooperative venture of the Commerce Department's National Institute of Standards and Technology (NIST) and the University of Maryland Biotechnology Institute, developed software to automate the analysis of dozens--and perhaps hundreds, eventually--of sequence families at a time.

The automated methods also assess the reliability of all the information, so that conclusions are based on the most reliable parts of the analysis.

The CARB method has two parts. The first part consists of a combination of manual and automated processing of gene data from public databases. The data are clustered into families through matching of similar sequences, first in pairs and then in groups.

Then family trees are developed indicating how the genes are related to each other. A file is developed for each family that includes data on sequence matches, intron locations, family trees and reliability measures.
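The clustering step, in which pairwise matches are merged into groups, can be illustrated with single-linkage clustering over a union-find structure. This is a generic sketch, not CARB's actual procedure, which the article does not describe in detail. The sequence names, similarity scores and threshold are all hypothetical.

```python
def cluster_families(pairs, threshold):
    """Group sequences into families by linking every pair that scores >= threshold."""
    parent = {}

    def find(x):
        # Register unseen sequences as their own family, then walk to the root.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving keeps lookups short
            x = parent[x]
        return x

    for a, b, score in pairs:
        root_a, root_b = find(a), find(b)
        if score >= threshold and root_a != root_b:
            parent[root_a] = root_b  # merge the two families

    families = {}
    for seq in list(parent):
        families.setdefault(find(seq), set()).add(seq)
    return sorted(sorted(members) for members in families.values())


# Hypothetical pairwise similarity scores between gene sequences.
pairs = [
    ("geneA_human", "geneA_mouse", 0.92),
    ("geneA_mouse", "geneA_yeast", 0.71),
    ("geneA_human", "geneB_human", 0.20),
    ("geneB_human", "geneB_fly", 0.85),
]
families = cluster_families(pairs, threshold=0.5)
print(families)
```

Single linkage is the simplest choice for going from pairs to groups. A real pipeline would likely apply stricter criteria before merging two families on the strength of one match.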

These datasets then are loaded into the second part of the system, which is fully automated. It consists of a relational database combined with software that computes probabilities for introns being present in ancestral genes using a method developed at CARB.

Each gene is assigned to a kingdom (plants, animals, fungi and others), and a matrix of intron presence/absence data is determined for each family based on the sequence alignments. This matrix, along with the family tree, is used to estimate ancestral states of introns, as well as rates of intron loss and gain. Additional software is used for analysis and visualization of results.
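One way to see how a presence/absence matrix and a family tree can yield a probability that an intron was present in an ancestral gene is a two-state gain/loss Markov model evaluated with Felsenstein's pruning algorithm. The sketch below is a textbook technique used for illustration and is not the CARB method, whose details the article does not give. The tree, branch lengths, rates and prior are all assumed values, and the rates are fixed inputs here rather than estimated from the data.

```python
import math


def transition(gain, loss, t):
    """Two-state transition matrix P[from][to] over branch length t (0 = absent, 1 = present)."""
    total = gain + loss
    pi1, pi0 = gain / total, loss / total  # stationary frequencies
    decay = math.exp(-total * t)
    return [[pi0 + pi1 * decay, pi1 * (1 - decay)],
            [pi0 * (1 - decay), pi1 + pi0 * decay]]


def conditional_likelihoods(node, states, gain, loss):
    """Return [L(absent), L(present)] for the data below `node` (Felsenstein pruning).

    A leaf is a gene name (str); an internal node is a list of (child, branch_length).
    """
    if isinstance(node, str):
        return [1.0, 0.0] if states[node] == 0 else [0.0, 1.0]
    like = [1.0, 1.0]
    for child, t in node:
        p = transition(gain, loss, t)
        below = conditional_likelihoods(child, states, gain, loss)
        for s in (0, 1):
            like[s] *= p[s][0] * below[0] + p[s][1] * below[1]
    return like


def root_presence_probability(tree, states, gain, loss, prior_present=0.5):
    """Posterior probability that the intron was present in the ancestral (root) gene."""
    l0, l1 = conditional_likelihoods(tree, states, gain, loss)
    return prior_present * l1 / (prior_present * l1 + (1 - prior_present) * l0)


# Hypothetical family tree: two animal genes form a clade; plant and fungal genes branch off.
tree = [([("animal_a", 0.3), ("animal_b", 0.3)], 0.2), ("plant", 0.5), ("fungus", 0.5)]

# One row of a presence/absence matrix: the intron occurs only in the animal genes.
row = {"animal_a": 1, "animal_b": 1, "plant": 0, "fungus": 0}

p_root = root_presence_probability(tree, row, gain=0.5, loss=0.5)
print(f"P(intron present at root) = {p_root:.3f}")
```

Applying `root_presence_probability` to every row of the matrix would give one ancestral probability per intron position. A fuller treatment would also fit the gain and loss rates to the data, as the description above indicates the CARB software does.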

The CARB study analyzed data for 10 families of protein-coding genes in multi-celled organisms, encompassing 1,868 introns at 488 different positions.
